(57) Abstract: Many Gram-negative pathogens assemble adhesive structures on their surfaces
that allow them to colonize host tissues and cause disease. Novel compositions for the
prevention or inhibition of pilus assembly in Gram-negative pathogens are disclosed.
Interacting with the binding site of pili subunits will negatively affect the
chaperone/usher pathway, which is one molecular mechanism by which Gram-negative bacteria
assemble adhesive pili structures, and thus prevent or inhibit pilus assembly. Additionally,
novel compounds and compositions for interfering with or preventing adhesion of piliated
bacteria to host tissues are provided. Such compounds and compositions prevent or inhibit
pili adhesion to host tissues by interacting with the mannose-binding domains on pilus
adhesin subunits. Also provided are methods for the treatment or prevention of diseases
caused by tissue-adhering pilus-forming bacteria by interaction with the binding between
pilus subunits; the binding between pilus subunits and periplasmic chaperones; and the
binding of a pilus adhesin to the host epithelial tissue. Also provided are pharmaceutical
preparations capable of interacting with the binding between pilus subunits, between pilus
subunits and periplasmic chaperones and between the pilus adhesin. The present invention
further relates to co-crystals of pilus chaperone-subunit co-complexes, detailed three
dimensional structural information illustrating the interaction between pilus subunits
and/or between a pilus subunit and a chaperone for a pilus chaperone-subunit co-complex and
methods of utilizing the X-ray crystallographic data from such co-crystals to design,
identify and screen for compounds that exhibit antibacterial activity. The present invention
also relates to machine readable media embedded with the three-dimensional atomic structure
coordinates of pilus chaperone-subunit co-complex and subsets thereof.
FURTHER INFORMATION CONTINUED FROM PCT/ISA/210

This International Searching Authority found multiple (groups of)
inventions in this international application, as follows:

1. Claims: 1-3,6,16-22,24-95 (all in part); 4,7-9 (complete)

Compounds binding to a pilus subunit groove and inhibiting
pilus assembly comprising a mimic of an amino-terminal motif
of a pilus subunit. Compositions comprising them and methods
of identifying antibacterial compounds.

2. Claims: 1-3,6,13,16-22,24-95 (all in part); 10-12 (complete)

Compounds binding to a pilus subunit groove and inhibiting
pilus assembly comprising a mimic of a chaperone G1
beta-strand. Compositions comprising them and methods of
identifying antibacterial compounds.

3. Claims: 14-15,23 (complete)

Mannose analogues capable of competitively binding the
amino-terminal mannose binding domain and method of
inhibiting pili adhesion.
Continuation of Box I.2

Claims Nos.: 1-8,10-11,13-95 (all partially)

Present claims 1-3 relate to compounds defined by reference to a
desirable characteristic or property, namely the ability of binding to a
pilus subunit groove. The claims cover all compounds having this
characteristic or property, whereas the application provides support
within the meaning of Article 6 PCT and disclosure within the meaning of
Article 5 PCT for only a very limited number of such compounds. In the
present case, the claims so lack support, and the application so lacks
disclosure, that a meaningful search over the whole of the claimed scope
is impossible. Independent of the above reasoning, the claims also lack
clarity (Art. 6 PCT). An attempt is made to define a compound by
reference to a result to be achieved. Again, this lack of clarity in the
present case is such as to render a meaningful search over the whole of
the claimed scope impossible.

In addition, claims 4-8,10-11 relate to an extremely large number of
possible products. Formulas consisting virtually of variables cannot be
considered to be a clear and concise definition of patentable
subject-matter (Art. 6 PCT). The claims so lack support, and the
application so lacks disclosure (Art. 5 PCT), that a meaningful search
over the whole of the claimed scope is impossible.
Consequently, the search has been carried out for those parts of the
claims which appear to be clear, supported and disclosed, and has been
directed to the general concept as defined in claim 1 and to compounds
selected from the group listed in claims 9 and 12 (e.g.: SEQ IDs NO:
1-52) and extended to claims 13-95 only in so far as they relate to the
subject-matter of claims 9 and 12.

The applicant's attention is drawn to the fact that claims, or parts of
claims, relating to inventions in respect of which no international
search report has been established need not be the subject of an
international preliminary examination (Rule 66.1(e) PCT). The applicant
is advised that the EPO policy when acting as an International
Preliminary Examining Authority is normally not to carry out a
preliminary examination on matter which has not been searched. This is
the case irrespective of whether or not the claims are amended following
receipt of the search report or during any Chapter II procedure.
(54) Title: ANTI-BACTERIAL COMPOUNDS DIRECTED AGAINST PILUS BIOGENESIS, ADHESION AND ACTIVITY;
CO-CRYSTALS OF PILUS SUBUNITS AND METHODS OF USE THEREOF

ANTI-BACTERIAL COMPOUNDS DIRECTED AGAINST PILUS
BIOGENESIS, ADHESION AND ACTIVITY; CO-CRYSTALS OF 
PILUS SUBUNITS AND METHODS OF USE THEREOF 

This invention was made in part with Government support under National Institutes of 
Health Grants R01DK51406, R01AI29549 and R01GM54033. The Government has certain
rights in the invention. 

This application claims priority to co-pending United States provisional patent
application Ser. No. 60/148^80, filed August 1 U 1999, incorporated herein by reference. 

Field of the Invention 

The present invention relates to compounds and methods for the treatment of diseases
caused by tissue-adhering pilus-forming bacteria. More specifically, the invention relates to
pharmaceutical preparations comprising substances capable of interfering with the binding of
periplasmic chaperones to pilus subunits as well as pharmaceutical compounds capable of
interfering with the binding between pilus subunits.

The present invention further relates to crystalline forms of pilus-subunit co-complexes,
the high-resolution X-ray diffraction structures and atomic structure coordinates
obtained therefrom. The pilus subunit co-crystals of the invention and the atomic structural
information obtained therefrom are useful for solving structures of related proteins, and for
screening for, identifying and/or designing compounds that bind periplasmic chaperones or
pilus subunits and thus prevent the assembly and/or biological function of pili.

Background of the Invention 

Many pathogenic Gram-negative bacteria such as Escherichia coli, Haemophilus
influenzae, Salmonella enteritidis, Salmonella typhimurium, Bordetella pertussis, Yersinia
enterocolitica, Yersinia pestis, Helicobacter pylori and Klebsiella pneumoniae assemble
hair-like adhesive organelles called pili on their surfaces. Pili are thought to mediate
microbial attachment, often the essential first step in the development of disease, by binding
to receptors present in host tissues and may also participate in bacterial-bacterial interactions
important in biofilm formation.

Uropathogenic strains of E. coli express P and type 1 pili that bind to receptors present
in uroepithelial cells. Adhesive P pili are virulence determinants associated with
pyelonephritic strains of E. coli whereas type 1 pili appear to be more common in E. coli causing
cystitis. The adhesin present at the tip of the pilus, PapG, binds to the Galα(1-4)Gal moiety
present in the glycolipids and glycoproteins, while the type 1 adhesin, FimH, binds
D-mannose present in glycolipids and glycoproteins.

Type 1 pili are adhesive fibers expressed in E. coli as well as in most of the
Enterobacteriaceae family. The type 1 pilus is a right-handed helix with about 3 subunits per
turn, a diameter of approximately 70 Å, a central pore of about 20-25 Å, and a rise per
subunit of about 8 Å. See G.E. Soto et al., EMBO J., 17: 6155 (1998). Type 1 pili are
composite structures in which a short tip fibrillar structure containing FimG and the FimH
adhesin (and possibly the minor component FimF as well) is joined to a rod comprised
predominantly of FimA subunits. See Jones et al., Proc. Natl. Acad. Sci. U.S.A., 92: 2081
(1995). The FimH adhesin mediates binding to mannose-oligosaccharides. See S.N.
Abraham et al., Nature, 336: 682 (1988); K.A. Krogfelt et al., Infect. Immun., 58: 1995
(1990). In uropathogenic E. coli, this binding event has been shown to play a critical role in
bladder colonization and disease.
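
The helical parameters quoted above fix two further quantities by simple arithmetic: the pitch (rise per turn) and the rotation between successive subunits. The following is a minimal sketch of that arithmetic, using the approximate values from the text. The function names and the 1 µm rod length are illustrative assumptions and do not come from the cited work.

```python
def helix_geometry(subunits_per_turn, rise_per_subunit_angstrom):
    """Return (pitch in angstroms, rotation per subunit in degrees) for a regular helix."""
    pitch = subunits_per_turn * rise_per_subunit_angstrom
    rotation = 360.0 / subunits_per_turn
    return pitch, rotation


def subunits_in_rod(rod_length_angstrom, rise_per_subunit_angstrom):
    """Approximate number of subunits stacked along a rod of the given length."""
    return round(rod_length_angstrom / rise_per_subunit_angstrom)


# Approximate values quoted in the text for the type 1 pilus.
pitch, rotation = helix_geometry(subunits_per_turn=3.0, rise_per_subunit_angstrom=8.0)

# Hypothetical rod length of 1 micrometre (10,000 angstroms); not a value from the text.
count = subunits_in_rod(rod_length_angstrom=10_000.0, rise_per_subunit_angstrom=8.0)

print(f"pitch ~ {pitch:.0f} A, rotation per subunit ~ {rotation:.0f} degrees")
print(f"a 1 um rod would hold roughly {count} subunits")
```

With about 3 subunits per turn and a rise of about 8 Å, the pitch comes out near 24 Å and each subunit is rotated by roughly 120 degrees relative to its neighbor.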

Type 1 pilus biogenesis proceeds by way of a highly conserved chaperone/usher
pathway that is involved in the assembly of over 25 adhesive organelles in the Gram-negative
bacteria. See G.E. Soto and S. Hultgren, J. Bacteriol., 181: 1059 (1999). The usher forms an
oligomeric channel in the outer membrane with a pore size of approximately 2.5 nm and
mediates subunit translocation across the outer membrane. See D.G. Thanassi et al., Proc.
Natl. Acad. Sci. U.S.A., 95: 3146 (1998).

The P pilus is a heteropolymeric surface fiber with an adhesive tip and consists of two major
sub-assemblies, the pilus rod and the tip fibrillum. The pilus rod is a thick rigid rod made up
of repeating PapA subunits arranged in a right-handed helical cylinder whereas the tip
fibrillum is a thin, flexible tip fiber extending from the distal end of the pilus rod and is
composed primarily of repeating PapE subunits arranged in an open helical configuration.
Two components of the tip fibrillum, PapK and PapF, act as adaptors. PapK is thought to
link the pilus rod to the base of the tip fibrillum and regulates the length of the tip fibrillum:
its incorporation terminates its growth and nucleates the formation of the pilus rod. PapF is
thought to join the PapG adhesin to the distal end of the flexible tip fibrillum.

The biogenesis of P pili also occurs via the highly conserved chaperone/usher
pathway. See D.G. Thanassi et al., Curr. Opin. Microbiol., 1: 223 (1998); D.L. Hung et al.,
EMBO J., 15: 3792 (1996). P pili are adhesive organelles encoded by eleven genes in the pap
(pilus associated with pyelonephritis) gene cluster found on the chromosome of
uropathogenic strains of E. coli. Six genes encode structural pilus subunits, PapA, PapH,
PapK, PapE, PapF and PapG. See S.J. Hultgren et al., Cell, 73: 887 (1993).

In P pili, two of the genes in the pap operon, papD and papC, encode the chaperone
and usher, respectively. Chaperones such as PapD in E. coli are required to bind to pilus
proteins imported into the periplasmic space, partition them into assembly-competent
complexes and prevent non-productive aggregation of the subunits in the periplasm. See
M.J. Kuehn et al., Proc. Natl. Acad. Sci. U.S.A., 88: 10586 (1991). PapD is a periplasmic
chaperone that mediates the assembly of P pili. Detailed structural analysis has revealed that
the PapD chaperone is the prototype member of a conserved family of periplasmic
chaperones in Gram-negative bacteria. Periplasmic chaperones consist of two
immunoglobulin-like domains with a deep cleft between the two domains. See A. Holmgren
and C.I. Branden, Nature, 342: 248 (1989); M. Pellecchia et al., Nature Struct. Biol., 5: 885
(1998). Further, all members of the periplasmic chaperone superfamily have a conserved
hydrophobic core that maintains the overall features of the two domains.

Periplasmic chaperones, along with outer membrane ushers, constitute a molecular
mechanism necessary for guiding biogenesis of adhesive organelles in Gram-negative
bacteria. These chaperones function to cap and partition interactive subunits imported into
the periplasmic space into assembly-competent co-complexes, making non-productive
interactions unfavorable. The chaperone-subunit co-complexes are targeted to the outer
membrane usher, where the subunits assemble in a specific order to form a pilus.
During pilus biogenesis, PapD binds to and caps interactive surfaces on pilus subunits and
prevents their premature aggregation in the periplasm. PapD binds to each of the pilus
subunit types as they emerge from the cytoplasmic membrane and escorts them in
assembly-competent, native-like conformations from the cytoplasmic membrane to outer membrane
assembly sites comprised of PapC. PapC has been termed a molecular usher since it receives
chaperone-subunit co-complexes and incorporates, or ushers, the subunits from the chaperone
co-complex into the growing pilus in a defined order.

In the absence of an interaction with the chaperone, pilus subunits aggregate and are
proteolytically degraded. Kolmer et al. and Jones et al. have shown that the DegP protease
degrades pilus subunits in the absence of the chaperone. See J. Bacteriol., 178: 5925 (1996);
EMBO J., 16: 6394 (1997). This discovery led to the elucidation of the fate of pilus subunits
expressed in the presence or absence of the chaperone using monospecific antisera in Western
blots of cytosolic membrane, outer membrane and periplasmic proteins prepared according to
methods known in the art.

Thus, prevention or inhibition of normal pilus assembly in a Gram-negative bacterium
impacts the pathogenicity of the bacterium by preventing the bacterium from attaching to and
infecting host tissues. Moreover, changes in the binding between pilus subunits and
chaperones can have a dramatic impact on the efficiency of pilus assembly, and thus on the
ability of a Gram-negative bacterium to adhere to and, consequently, infect host tissues.
Prevention and inhibition of binding between pilus subunits and between pilus subunits and
periplasmic chaperones have the effect of impairing pilus assembly, whereby the infectivity
of the Gram-negative bacterium expressing the pili is reduced. Accordingly, a need exists, in
general, for compositions and methods for preventing or inhibiting the normal interaction
between pilus subunits and/or between a pilus subunit and a chaperone.

However, identification of such compositions has heretofore relied on serendipity
and/or systematic screening of large numbers of natural and synthetic compounds. A far
superior method of drug-screening relies on structure-based drug design. The three
dimensional structures of proteins or protein fragments are determined and potential agonists 
and/or potential antagonists are designed with the aid of computer modeling. However, 
heretofore the three-dimensional structure illustrating the interaction between pilus subunits 
and/or between a pilus subunit and a chaperone has remained unknown, essentially because 
no such protein co-crystals had been produced which would permit the required X-ray 
crystallographic data to be obtained. 

Therefore, there is presently a need for obtaining a co-crystal of a co-complex of a
pilus subunit and a chaperone to allow such crystallographic data to be obtained. Furthermore,
there is a need for the determination of the three-dimensional structure of such co-crystals.
Finally, there is a need for procedures for related structure-based drug design based on such
crystallographic data.

Summary of the Invention 

Accordingly, the present invention provides antibacterial compositions and 
compounds capable of inhibiting or preventing pilus assembly in a Gram-negative bacterium. 
Such compounds interfere with the function of chaperones required for the assembly of pili 
from pilus subunits in diverse Gram-negative bacteria. Another object of the invention is to 
provide compounds having antibacterial activity that prevent or inhibit pili assembly by 
interfering with the interactions between pilus subunits. Yet another object of the invention is 
to provide compounds capable of inhibiting or preventing the function of pili adhesion to host 
epithelium thereby reducing the capacity of bacteria to attach to and infect host tissues. It is a
further object of the invention to provide antibacterial compounds which have broad
specificity for a diverse group of Gram-negative bacteria. Other objects include the provision
of methods of preventing and inhibiting pilus assembly, methods of preventing or inhibiting
pili adhesion to host tissues, methods of treating bacterial infections, methods for preventing
and inhibiting biofilm formation and methods of preventing colonization by various
Gram-negative bacteria.

Another aspect of the invention is to provide crystalline forms of polypeptides 
corresponding to a pilus chaperone-subunit protein co-complex. Thus, further objects of the 
present invention include the provision of the atomic structure coordinates obtained from the 
pilus chaperone-subunit co-crystals and methods of utilizing the three dimensional structural
information obtained from the co-crystals to design or identify compounds with antibacterial
activity. Another related object is to provide machine- or computer-readable media
embedded with the three-dimensional structural information obtained from the pilus
chaperone-subunit co-complex, or portions or subsets thereof, which can be used to identify or
design antibacterial compounds. A further object is to provide methods of making the
co-crystals of the invention.

Therefore, in one aspect, the present invention is directed to isolated and purified
compounds and synthesized compounds which bind to a pilus subunit groove and thus inhibit
pilus assembly. Preferably, such compounds mimic the binding activity of the G1 beta-strand
of a periplasmic chaperone and comprise a polypeptide having an amino acid sequence
containing at least two alternating hydrophobic amino acid residues. In a preferred
embodiment, this polypeptide would be derived from a G1 beta-strand of a periplasmic
chaperone; more preferably, this polypeptide would be comprised of amino acids derived
from the N101 to L107 amino acid region of a beta-strand of a periplasmic chaperone. A
particularly preferred antibacterial compound comprises a peptide comprising an
amino-terminal amino acid sequence Asn-Val-Leu-Ghi-Ile-Ala-Leu (SEQ ID NO: 1) or any
related analogues that would competitively bind to the binding site of a pilus subunit.

In another embodiment, such compounds mimic the binding activity of the
amino-terminal end of a pilus subunit and comprise a polypeptide having an amino acid sequence
containing at least two alternating hydrophobic amino acid residues. Such antibacterial
compounds will competitively bind to a binding site on pilus subunits, thereby inhibiting or
preventing pilus assembly. A preferred polypeptide would be derived from the sequences of
conserved amino-terminal motifs of pilus subunits. A particularly preferred antibacterial
compound comprises a peptide comprising an amino-terminal amino acid sequence
Ser-Asp-Val-Ala-Phe-Arg-Gly-Asn-Leu-Leu (SEQ ID NO: 12) or any related analogues that would
competitively bind to the binding site of a pilus subunit.
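
The requirement of "at least two alternating hydrophobic amino acid residues" can be illustrated with a short check on a peptide sequence. The sketch below reads "alternating" as hydrophobic residues at positions i and i+2, and it uses an illustrative hydrophobic set (A, V, L, I, M, F, W, Y). Both choices are assumptions made for this example rather than definitions taken from the text, and the function names are hypothetical.

```python
from typing import List, Tuple

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Assumption: an illustrative hydrophobic set, not one defined in the text.
HYDROPHOBIC = set("AVLIMFWY")


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split("-"))


def alternating_hydrophobic_pairs(seq: str) -> List[Tuple[int, int]]:
    """Return 0-based index pairs (i, i+2) where both residues are hydrophobic."""
    return [
        (i, i + 2)
        for i in range(len(seq) - 2)
        if seq[i] in HYDROPHOBIC and seq[i + 2] in HYDROPHOBIC
    ]


# SEQ ID NO: 12 as written in the text.
seq12 = to_one_letter("Ser-Asp-Val-Ala-Phe-Arg-Gly-Asn-Leu-Leu")
pairs = alternating_hydrophobic_pairs(seq12)
has_motif = len(pairs) >= 1  # one pair means two alternating hydrophobic residues

print(seq12, pairs, has_motif)
```

Under these assumptions the Val and Phe residues of SEQ ID NO: 12 (positions 3 and 5 of the peptide) form one alternating hydrophobic pair. A different hydrophobicity scale would change which pairs are reported.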

A further object of the invention is to provide compounds which mimic mannose by
binding to the amino-terminal end of the FimH adhesin. Such antibacterial compounds will 
bind to the mannose-binding site on pilus adhesins, thereby inhibiting or preventing the 
function of the pili to attach to and infect host tissues. 

Interference with pili assembly and prevention of the capacity of pili to attach to host 
tissues are particularly effective since both the formation of pili and attachment of pili to host 
tissues are essential to bacterial pathogenicity. As such, the invention further provides 
compositions containing the above compounds in conjunction with a pharmaceutically- 
acceptable carrier, excipient or diluent. Also provided are methods of preventing or 
inhibiting pilus assembly in a Gram-negative bacterium by administering an effective amount 
of a compound capable of interfering with the binding of pilus subunits and all pilus subunit 
homologues. The invention is also directed to methods of preventing or inhibiting the 
pathogenicity of a Gram-negative bacterium comprising administering an effective amount of
a compound capable of interfering with the adhesion of pili to host tissues. Further provided
are methods for treating Gram-negative infections which comprise providing to a subject an
effective amount of the above compounds and compositions.

Further, the present invention is directed to methods for preventing or inhibiting 
biofilm formation on a surface or in an environment containing Gram-negative bacteria. Also
provided are methods for inhibiting bacterial colonization by a Gram-negative organism.
These methods are accomplished by administering to such surfaces and environments an 
effective amount of a compound or a composition which is capable of interfering with pilus 
assembly or the ability of the pilus to adhere to and subsequently infect host tissues. 

In another aspect, the invention provides compositions comprising crystalline forms
of polypeptides corresponding to the PapD-PapK chaperone-pilus subunit protein
co-complex. The PapD-PapK co-crystals comprise crystallized polypeptides corresponding to
the wild-type or mutated PapD-PapK co-complexes. The PapD-PapK co-crystals preferably
include native co-crystals, heavy-atom derivative co-crystals and co-crystals of a
PapD-PapK co-complex that is further associated with one or more other molecules or compounds.
Preferably, such other compounds bind to a site involved in protein-protein interactions in the
pilus.

The PapD-PapK co-crystals are generally characterized by a space group of P2₁2₁2₁
and a unit cell of a = 62.1 ± 0.2 Å, b = 63.6 ± 0.2 Å, c = 92.7 ± 0.2 Å, and are preferably of
diffraction quality. In a preferred embodiment, the PapD-PapK co-crystals are of sufficient
quality to permit the determination of the three-dimensional X-ray diffraction structure of the
crystalline polypeptide co-complex to high resolution, preferably to a resolution of greater
than about 3 Å, typically in the range of about 1 Å to about 3 Å.
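
As a worked illustration of the cell dimensions quoted above, the sketch below computes the unit cell volume and the volume per asymmetric unit. It assumes the cell is orthorhombic (all angles 90 degrees, so V = a·b·c) and that the space group is P2₁2₁2₁, which has four asymmetric units per cell. The bounds simply apply the stated ±0.2 Å tolerance to all three edges at once, and the variable names are illustrative.

```python
def orthorhombic_volume(a, b, c):
    """Volume of an orthorhombic cell in cubic angstroms (all angles 90 degrees)."""
    return a * b * c


# Cell edges and tolerance as quoted in the text, in angstroms.
A, B, C, TOL = 62.1, 63.6, 92.7, 0.2

volume = orthorhombic_volume(A, B, C)
low = orthorhombic_volume(A - TOL, B - TOL, C - TOL)
high = orthorhombic_volume(A + TOL, B + TOL, C + TOL)

# Assumption: space group P2(1)2(1)2(1), which has four asymmetric units per cell.
ASYMMETRIC_UNITS_PER_CELL = 4
volume_per_asu = volume / ASYMMETRIC_UNITS_PER_CELL

print(f"cell volume ~ {volume:,.0f} cubic angstroms ({low:,.0f} to {high:,.0f})")
print(f"volume per asymmetric unit ~ {volume_per_asu:,.0f} cubic angstroms")
```

The nominal cell volume is about 366,000 Å³, or roughly 91,500 Å³ per asymmetric unit. Dividing that figure by the molecular weight of the co-complex would give a Matthews coefficient, but the molecular weight is not stated in this passage, so that step is left out.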

The invention also provides methods of making the co-crystals of the invention. 
Generally, co-crystals of the invention are grown by dissolving substantially pure 
polypeptides in an aqueous buffer that includes a precipitant at a concentration just below that 
necessary to precipitate the polypeptide. Water is then removed by controlled evaporation to
produce precipitating conditions, which are maintained until co-crystal growth ceases. 

In another aspect, the invention provides machine- or computer-readable media
embedded with the three-dimensional structural information obtained from the PapD-PapK
co-crystals of the invention, or obtained from FimC-FimH co-crystals, or portions or subsets
thereof. Such three-dimensional structural information will typically include the atomic
structure coordinates of the crystallized polypeptide co-complex, or the atomic structure
coordinates of a portion thereof, such as, for example, the atomic structure coordinates of one
member of the co-complex or an active or binding site of one or both members, but may
include other structural information, such as vector representations of the atomic structure
coordinates, etc.
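
To make the notion of machine-readable coordinates and "subsets thereof" concrete, the sketch below writes and reads atomic coordinates in a fixed-column layout and selects the atoms of one member of a co-complex. The text does not name a file format. This example assumes the widely used PDB ATOM record columns, in simplified form. The chain labels "D" and "K", the function names, and every coordinate value are made-up placeholders, not data from the co-crystals.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def format_atom(serial: int, atom: Atom) -> str:
    """Write one atom using PDB-style ATOM columns (simplified name alignment)."""
    return (
        f"ATOM  {serial:>5} {atom.name:<4} {atom.res_name:>3} "
        f"{atom.chain}{atom.res_seq:>4}    "
        f"{atom.x:8.3f}{atom.y:8.3f}{atom.z:8.3f}"
    )


def parse_atom(line: str) -> Atom:
    """Read one ATOM record by fixed column positions."""
    return Atom(
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def select_chain(lines: List[str], chain: str) -> List[Atom]:
    """Return the subset of atoms belonging to one chain of the co-complex."""
    atoms = [parse_atom(ln) for ln in lines if ln.startswith("ATOM")]
    return [a for a in atoms if a.chain == chain]


# Placeholder atoms: hypothetical chain "D" (chaperone) and "K" (subunit).
example = [
    Atom("CA", "ASN", "D", 101, 1.000, 2.000, 3.000),
    Atom("CA", "LEU", "D", 103, 4.500, 5.500, 6.500),
    Atom("CA", "SER", "K", 1, -7.250, 8.125, 9.000),
]
records = [format_atom(i + 1, a) for i, a in enumerate(example)]
subunit_atoms = select_chain(records, "K")
print(subunit_atoms)
```

Selecting by chain is only one way to take a subset. The same pattern could filter by residue number to isolate a binding site.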

Thus, the atomic structure coordinates and machine readable media of the invention
have a variety of uses. As such, provided are methods of identifying antibacterial compounds
which utilize the coordinates for solving the three-dimensional X-ray diffraction and/or
solution structures of other proteins, including mutant co-complexes, co-complexes further
associated with other molecules, and unrelated proteins, to high resolution. Structural
information may also be used in a variety of molecular modeling and computer-based
screening applications to, for example, intelligently design mutants of the crystallized
PapD-PapK or FimC-FimH co-complexes having altered biological activity and to computationally
design and identify compounds that bind the polypeptide co-complexes or a portion or
fragment of the polypeptide co-complexes, such as the mannose binding site of FimH and/or
the G1 beta strand binding cleft of PapK.

In another aspect, the present invention provides methods of using the coordinates of
the PapD-PapK co-complex or of the FimC-FimH co-complex, or subsets of such structure
coordinates, to design or identify candidate compounds capable of binding to a binding site
on one member of the co-complex, or of a member of a related co-complex. Such candidate
compounds may be evaluated for biological activity, such as, for example, the ability to bind
(preferably competitively) the subunit of interest, the ability to disrupt chaperone-pilus
subunit assembly and/or the ability to avoid adherence of a Gram-negative bacterium to a
host tissue. In one embodiment, the co-crystals from which the PapD-PapK co-complex
structure is derived have the space group and cell dimensions described above, such that the
three dimensional structure of the co-complex is provided to a resolution of from about 3.0 Å
to about 2.4 Å or greater. In another embodiment, the co-crystals from which the
FimC-FimH co-complex structure is derived have the space group P4₁2₁2 or P4₃ with unit cell
dimensions of a = b = 97.7 ± 0.2 Å and c = 215.9 ± 0.2 Å, such that the three dimensional
structure of the co-complex can be determined to a resolution of from about 3.6 Å to about
2.5 Å or greater.

In a further aspect of the invention, such potential compounds are evaluated for
biological activity. Candidate antibacterial compounds are designed or identified using the
atomic structure coordinates of the PapD-PapK or FimC-FimH co-complexes or subsets
thereof, synthesized and screened for their ability to bind to pilus subunits, thereby inhibiting
or preventing pilus biogenesis. The antibacterial activity of the compound is determined by
assaying the bacterium for infectivity or monitoring the pilus for activity. Alternatively,
compounds designed or identified based upon their ability to bind the mannose binding
domain of FimH are synthesized and screened for their ability to bind FimH. Such
compounds that are able to prevent or inhibit pilus biogenesis or the ability of the bacterial
pilus to attach to a host tissue can be used in the compositions of the present invention.

Other objects and features will be in part apparent and in part pointed out hereinafter. 

Brief Description of Figures

Fig. 1A is a depiction of representative regions of the electron density of a PapD G1
beta-strand. Electron density is from a simulated annealing omit map calculated using the
phases derived from the final model where the PapD G1 beta-strand residues 101 to 108 have
been omitted. Strands are labeled.

Fig. 1B is a depiction of representative regions of electron density showing the PapD G1
beta-strand zippering to the PapK F strand. The density is from a map calculated using
unbiased experimental MAD solvent-flattened phases.

Fig. 1C is a view from the hydrophobic core of PapK looking out toward the PapD G1
beta-strand that inserts into the groove of the subunit. Residues throughout are labeled. The
density is from a map calculated using unbiased experimental MAD solvent-flattened phases.

Fig. 2A is a schematic of a stereo ribbon diagram. Subscripts 1 and 2 refer to
domains 1 and 2 of PapD, respectively.

Fig. 2B is a stereo ribbon diagram. The molecular surface of PapK was calculated and
displayed using GRASP. The structure of PapD is shown as a ribbon. The insertion of the G1
beta-strand of PapD into a deep groove on the surface of PapK can be seen.

Fig. 3A is the topology of PapK. Beta-strands are indicated as arrows, while helices
(either α or 3₁₀) are shown as cylinders.



WO 01/10386 PCT/US00/22087

Fig. 3B is a depiction of the sequence alignment of P-pilus subunits (PapA, PapK,
PapE, and PapF). The secondary structural elements of PapK are indicated above the aligned
sequences. Residue numbers of PapK are indicated above the PapK sequence. The
remarkable conservation of structurally and functionally important residues strongly indicates
that all pilins have structures similar to PapK.

Fig. 3C is a depiction of the secondary structure definition of PapD. Residue numbers
are indicated above the sequence, while secondary structural elements are indicated below it.

Fig. 4 depicts the superposition of the structures of apo-PapD and PapD complexed to
PapK. The arrow indicates the conformational change in the F1-G1 loop upon subunit
binding.

Fig. 5 is the definition of the binding sites in PapD and PapK. On the left, PapD is 
shown as a space-filling model and PapK as a ribbon. On the right, PapK is shown as a 
space-filling model and PapD as a ribbon. The various binding sites as defined in the text are 
labeled. 

Fig. 6A is a schematic of a stereo contact diagram of interactions between PapD and
the NH2-terminus of PapK. Residues making contacts are shown in stick representation (thin
for PapD, and thick for PapK).

Fig. 6B is a schematic of a stereo contact diagram of interactions between PapD and
the COOH-terminal F strand of PapK. The NH2-terminal strand A and the COOH-terminal
strand F form the sides of the groove in PapK. Residues making contacts are shown in stick
representation (thin for PapD, and thick for PapK).

Fig. 6C is a schematic of a stereo contact diagram of interactions between PapK and 
domain 2 of PapD. Residues making contacts are shown in stick representation (thin for 
PapD, and thick for PapK). 

Fig. 6D is a schematic of a stereo contact diagram of interactions between the 
terminal carboxylate of PapK with PapD. Residues making contacts are shown in stick 
representation (thin for PapD, and thick for PapK). 

Fig. 6E is a depiction of the G1 beta-strand of PapD as it inserts into the groove of 
PapK. The PapD G1 strand is represented as a stick model with color coding as in Fig. 6A 
and PapK is shown as a molecular surface calculated using GRASP. Notice the 
predominance of hydrophobic residues in the groove, the base of which is part of the 
hydrophobic core of the protein. 

Fig. 7A is a schematic diagram of subunit-subunit interactions in the pilus rod model as 
viewed from above. Insertion of the NH2-terminal strand of one subunit into the groove made 
by the A2 and F strands of the preceding subunit such that the NH2-terminal strand is parallel 
to strand F results in a three-pointed-star-shaped cross-section inconsistent with electron 
microscopy data. Strands (arrows) are labeled, as are the NH2- and COOH-termini (N and C 
respectively). Hydrogen bonding interactions are shown schematically. 

Fig. 7B is a schematic diagram of subunit-subunit interactions in the pilus model as 
viewed from above. Insertion of the NH2-terminal strand antiparallel to strand F yields a 
cross-section compatible with electron microscopy data. Strands (arrows) are labeled, as are 
the NH2- and COOH-termini (N and C respectively). Hydrogen bonding interactions are 
shown schematically. 

Fig. 7C is a molecular surface of a pilus rod (program GRASP). The disordered 
residues at the NH2-terminus of the subunit were modeled as a strand that inserts into the 
groove of the preceding subunit. Approximately three turns of the model pilus, whose 
dimensions are similar to the known values from electron microscopy, are shown. 

Fig. 7D is a stereo ribbon diagram of the rod model. The insertion of the NH2- 
terminal strand of one subunit into the groove of the preceding subunit can be clearly seen. 

Fig. 8A depicts the amino acid sequences of type 1 pilus subunits (FimA, FimF, FimG, 
FimH). The end of the mannose binding lectin domain and the start of the pilin domain in 
FimH are indicated by vertical arrows above the sequences. Type 1 pilin subunits (FimA, 
FimF, FimG) were aligned with the pilin domain of FimH using Clustal W and manually 
adjusted to minimize gaps in secondary structure elements. Gaps in the alignment are 
indicated by dots. Sequence numbering for FimH starts at position 22 in the pre-protein. 
Residues involved in chaperone binding are indicated by an open circle above the residue. 
Residues in the carbohydrate binding pocket are boxed. A large box marks the NH2-terminal 
extensions in the pilin subunits. The conserved beta-zipper motif found in all pilin subunits 
corresponds to the F beta-strand. Limits and nomenclature for secondary structure elements 
are shown below the sequence. 

Fig. 8B shows beta-sheet topology diagrams of the mannose binding domain (left) and 
pilin domain (right) of FimH. 

Fig. 9A is a typical sample of the solvent flattened experimental electron density map 
(contoured at 1.0σ) with the refined model superimposed. Arg and Lys residues anchor the 
COOH-terminus of FimH in the subunit binding cleft of the chaperone via hydrogen bonds to 
the terminal carboxylate. 

Fig. 9B is a MOLSCRIPT ribbon diagram of the FimC-FimH co-complex. A 
ball-and-stick representation of the C-HEGA molecule bound to the lectin domain of FimH 
indicates the position of the carbohydrate-binding site at the tip of the domain. 

Fig. 10A is a depiction of FimH carbohydrate binding. A stereo view of the 
carbohydrate binding pocket with a molecule of C-HEGA bound is shown. Residues Phe, Ile, 
Asn, Asp, Tyr, Ile, Asp, Gln, Asn, Tyr, Asn, Asp and Phe line the 
surface of the pocket at the tip of the lectin domain. Residues that take part in 
hydrogen bonding to the glucamide moiety of C-HEGA are labeled. 

Fig. 10B is a depiction of the surface of the FimH pilin domain showing the exposed 
hydrophobic core. Hydrophobic residues that are in contact with FimC in the co-complex but 
solvent exposed upon removal of the chaperone are highlighted in yellow. Right: as left but 
with FimC ribbon in blue. The seventh (G1) strand of FimC donates hydrophobic residues to 
complement the incomplete hydrophobic core of the pilin domain. 

Fig. 10C is a close-up of donor strand complementation interactions. Hydrophobic 
residues on the surface of the pilin domain (Val'«". Ala'^^, TV««, Be"'", Leu"'«, Val^«, 
Leu^, Ile^™, Val^^^". and Phe'^^) and FimC residues involved in donor strand 
complementation (Leu^°^^, Leu}"^, Ile'**'^ Ser'^, Ile"'^) pack against each other to form a 
complete hydrophobic core extending between the two proteins. 

Fig. 11A is a model of the type 1 pilus. 

Fig. 11B is a top view of the type 1 pilus. Residue positions that are subject to allelic 
variation map to the outer surface of the pilus. 

Fig. 11C is a side view of the type 1 pilus. 

Fig. 12 is a graph representing the binding of FimH to polypeptides corresponding 
to the G1 beta-strand of FimC and the N-terminal extension of FimC. The two polypeptides 
or FimC were coated onto microtiter wells and FimH binding to the immobilized 
polypeptides or FimC protein was determined by ELISA using anti-FimH antibodies. The 
graph represents the average of triplicate wells with the standard deviation shown in bars. 

Fig. 13 is a graph which represents the binding of FimH in the presence of increasing 
concentrations of the FimC polypeptide. It can be seen that FimC polypeptides inhibit FimH 
binding to FimC. The graphs represent the average of triplicate wells with the standard 
deviation shown in bars. 

Fig. 14 is a graph which represents the FimH binding to FimC in the presence or 
absence of FimG or FimC polypeptides as monitored by ELISA. The graphs represent the 
average of triplicate wells with the standard deviation shown in bars. 

Abbreviations and Definitions 
To facilitate understanding of the invention, a number of terms are defined below: 
The amino acid notations used herein for the twenty genetically encoded L-amino 
acids are conventional and are abbreviated as follows: 



Amino Acid        One-Letter Symbol    Three-Letter Symbol 

Alanine           A                    Ala 
Arginine          R                    Arg 
Asparagine        N                    Asn 
Aspartic acid     D                    Asp 
Cysteine          C                    Cys 
Glutamine         Q                    Gln 
Glutamic acid     E                    Glu 
Glycine           G                    Gly 
Histidine         H                    His 
Isoleucine        I                    Ile 
Leucine           L                    Leu 
Lysine            K                    Lys 
Methionine        M                    Met 
Phenylalanine     F                    Phe 
Proline           P                    Pro 
Serine            S                    Ser 
Threonine         T                    Thr 
Tryptophan        W                    Trp 
Tyrosine          Y                    Tyr 
Valine            V                    Val 


As used herein, unless specifically delineated otherwise, the three-letter and one-letter 
amino acid abbreviations designate amino acids in either the D-configuration or the 
L-configuration. For example, Arg designates D-arginine and L-arginine, and R designates 
D-arginine and L-arginine. 

Unless noted otherwise, when polypeptide sequences are presented as a series of 
one-letter and/or three-letter abbreviations, the sequences are presented in the N → C 
direction, in accordance with common practice. As used herein, "Cα" refers to the alpha 
carbon of an amino acid residue. 
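
The notation convention above can be illustrated with a minimal sketch that expands a one-letter sequence, read in the N → C direction, into three-letter abbreviations. The function name `to_three_letter` and the example tripeptide are hypothetical and are used for illustration only; the mapping itself follows the abbreviation table above.

```python
# One-letter to three-letter symbols for the twenty genetically encoded amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence):
    """Expand a one-letter sequence (written N to C) into hyphen-separated three-letter symbols."""
    residues = []
    for position, letter in enumerate(sequence.upper(), start=1):
        if letter not in ONE_TO_THREE:
            raise ValueError(f"unknown one-letter symbol {letter!r} at position {position}")
        residues.append(ONE_TO_THREE[letter])
    return "-".join(residues)


# Hypothetical tripeptide, leftmost residue at the N-terminus.
print(to_three_letter("RGD"))
```

Because the abbreviations designate either the D- or the L-configuration unless stated otherwise, the sketch carries no stereochemical information.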

For purposes of determining conservative amino acid substitutions in the various 
polypeptides described herein and for describing the various peptide and peptide analog 
compounds, the amino acids can be conveniently classified into two main categories, 
hydrophilic and hydrophobic, depending primarily on the physical-chemical characteristics 
of the amino acid side chain. These two main categories can be further classified into 
subcategories that more distinctly define the characteristics of the amino acid side chains. 
For example, the class of hydrophilic amino acids can be further subdivided into acidic, basic 
and polar amino acids. The class of hydrophobic amino acids can be further subdivided into 
apolar and aromatic amino acids. The definitions of the various categories of amino acids are 
as follows: 

"Hydrophilic amino acid" refers to an amino acid exhibiting a hydrophobicity of less 
than zero according to the normalized consensus hydrophobicity scale of Eisenberg et al., 
1984, J. Mol. Biol. 179:125-142. Genetically encoded hydrophilic amino acids include Thr 
(T), Ser (S), His (H), Glu (E), Asn (N), Gln (Q), Asp (D), Lys (K) and Arg (R). 

"Acidic amino acid" refers to a hydrophilic amino acid having a side chain pK value 
of less than 7. Acidic amino acids typically have negatively charged side chains at 
physiological pH due to loss of a hydrogen ion. Genetically encoded acidic amino acids 
include Glu (E) and Asp (D). 

"Basic amino acid" refers to a hydrophilic amino acid having a side chain pK value of 
greater than 7. Basic amino acids typically have positively charged side chains at 
physiological pH due to association with hydronium ion. Genetically encoded basic amino 
acids include His (H), Arg (R) and Lys (K). 

"Polar amino acid" refers to a hydrophilic amino acid having a side chain that is 
uncharged at physiological pH, but which has at least one bond in which the pair of electrons 
shared in common by two atoms is held more closely by one of the atoms. Genetically 
encoded polar amino acids include Asn (N), Gln (Q), Ser (S) and Thr (T). 

"Hydrophobic amino acid" refers to an amino acid exhibiting a hydrophobicity of 
greater than zero according to the normalized consensus hydrophobicity scale of Eisenberg, 
1984, J. Mol. Biol. 179:125-142. Genetically encoded hydrophobic amino acids include Pro 
(P), Ile (I), Phe (F), Val (V), Leu (L), Trp (W), Met (M), Ala (A), Gly (G) and Tyr (Y). 

"Aromatic amino acid" refers to a hydrophobic amino acid with a side chain having at 
least one aromatic or heteroaromatic ring. The aromatic or heteroaromatic ring may contain 
one or more substituents such as -OH, -SH, -CN, -F, -Cl, -Br, -I, -NO2, -NO, -NH2, -NHR, 
-NRR, -C(O)R, -C(O)OH, -C(O)OR, -C(O)NH2, -C(O)NHR, -C(O)NRR and the like where 
each R is independently (C1-C6) alkyl, substituted (C1-C6) alkyl, (C1-C6) alkenyl, substituted 
(C1-C6) alkenyl, (C1-C6) alkynyl, substituted (C1-C6) alkynyl, (C5-C20) aryl, substituted (C5-C20) 
aryl, (C6-C26) arylalkyl, substituted (C6-C26) arylalkyl, 5-20 membered heteroaryl, substituted 
5-20 membered heteroaryl, 6-26 membered heteroarylalkyl or substituted 6-26 membered 
heteroarylalkyl. Genetically encoded aromatic amino acids include His (H), Phe (F), Tyr (Y) 
and Trp (W). 

"Apolar amino acid" refers to a hydrophobic amino acid having a side chain that is 
uncharged at physiological pH and which has bonds in which the pair of electrons shared in 
common by two atoms is generally held equally by each of the two atoms (i.e., the side chain 
is not polar). Genetically encoded apolar amino acids include Leu (L), Val (V), Ile (I), Met 
(M), Gly (G) and Ala (A). 

"Aliphatic amino acid" refers to a hydrophobic amino acid having an aliphatic 
hydrocarbon side chain. Genetically encoded aliphatic amino acids include Ala (A), Val (V), 
Leu (L) and Ile (I). 

"Hydroxyl-substituted aliphatic amino acid" refers to a hydrophilic polar amino acid 
having a hydroxyl-substituted side chain. Genetically encoded hydroxyl-substituted aliphatic 
amino acids include Ser (S) and Thr (T). 

The amino acid residue Cys (C) is unusual in that it can form disulfide bridges with 
other Cys (C) residues or other sulfanyl-containing amino acids. The ability of Cys (C) 
residues (and other amino acids with -SH containing side chains) to exist in a peptide in either 
the reduced free -SH or oxidized disulfide-bridged form affects whether Cys (C) residues 
contribute net hydrophobic or hydrophilic character to a peptide. While Cys (C) exhibits a 
hydrophobicity of 0.29 according to the normalized consensus scale of Eisenberg (Eisenberg, 
1984, supra), it is to be understood that for purposes of the present invention Cys (C) is 
categorized as a polar hydrophilic amino acid, notwithstanding the general classifications 
defined above. 

As will be appreciated by those of skill in the art, the above-defined categories are not 
mutually exclusive. Thus, amino acids having side chains exhibiting two or more 
physical-chemical properties can be included in multiple categories. For example, amino acid 
side chains having aromatic moieties that are further substituted with polar substituents, such 
as Tyr (Y), may exhibit both aromatic hydrophobic properties and polar or hydrophilic 
properties, and can therefore be included in both the aromatic and polar categories. As 
another example, His (H) has a side chain that falls within the aromatic and basic categories. 
The appropriate categorization of any amino acid will be apparent to those of skill in the art, 
especially in light of the detailed disclosure provided herein. 

While the above-defined categories have been exemplified in terms of the genetically 
encoded amino acids, the amino acid substitutions need not be, and in certain embodiments 
preferably are not, restricted to the genetically encoded amino acids. Indeed, since many of 
the compounds described herein may be produced synthetically, they may comprise one or 
more genetically non-encoded amino acids. Thus, in addition to the naturally occurring 
genetically encoded amino acids, amino acid residues in the core peptides of structure (I) may 
be substituted with naturally occurring non-encoded amino acids and synthetic amino acids. 
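
By way of illustration only, the categories defined above for the genetically encoded amino acids can be written down as a small lookup. The sketch below assumes, as a simplification not stated in the text, that a substitution may be regarded as conservative when the two residues share at least one subcategory; the names `CATEGORIES`, `categories_of` and `shares_subcategory` are hypothetical. Cys is placed in the polar category as specified above, and Pro is listed as apolar following Table 1.

```python
# Subcategory membership for the genetically encoded amino acids,
# taken from the definitions in the text (one-letter symbols).
CATEGORIES = {
    "acidic": set("ED"),
    "basic": set("HRK"),
    "polar": set("NQSTC"),      # Cys is categorized as polar hydrophilic by the text
    "aromatic": set("HFYW"),
    "apolar": set("LVIMGAP"),   # Pro is listed under Apolar in Table 1
    "aliphatic": set("AVLI"),
    "hydroxyl-substituted aliphatic": set("ST"),
}


def categories_of(residue):
    """Return every subcategory that contains the given one-letter residue."""
    symbol = residue.upper()
    return {name for name, members in CATEGORIES.items() if symbol in members}


def shares_subcategory(residue_a, residue_b):
    """Assumed criterion: two residues are similar if they share a subcategory."""
    return bool(categories_of(residue_a) & categories_of(residue_b))


print(sorted(categories_of("H")))    # His falls in two categories
print(shares_subcategory("L", "V"))  # both apolar and aliphatic
print(shares_subcategory("D", "K"))  # acidic versus basic
```

Because the categories are not mutually exclusive, `categories_of` returns a set rather than a single label.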

Certain commonly encountered amino acids of which the compounds of the invention 
may be comprised include, but are not limited to, β-alanine (β-Ala) and other omega-amino 
acids such as 3-aminopropionic acid, 2,3-diaminopropionic acid (Dpr), 4-aminobutyric acid 
and so forth; α-aminoisobutyric acid (Aib); ε-aminohexanoic acid (Aha); δ-aminovaleric 
acid (Ava); N-methylglycine or sarcosine (MeGly); ornithine (Orn); citrulline (Cit); 
t-butylalanine (t-BuA); t-butylglycine (t-BuG); N-methylisoleucine (MeIle); phenylglycine 
(Phg); cyclohexylalanine (Cha); norleucine (Nle); naphthylalanine (Nal); 
4-chlorophenylalanine (Phe(4-Cl)); 2-fluorophenylalanine (Phe(2-F)); 3-fluorophenylalanine 
(Phe(3-F)); 4-fluorophenylalanine (Phe(4-F)); penicillamine (Pen); 
1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid (Tic); β-2-thienylalanine (Thi); methionine 
sulfoxide (MSO); homoarginine (hArg); N-acetyl lysine (AcLys); 2,4-diaminobutyric acid 
(Dbu); 2,3-diaminobutyric acid (Dab); p-aminophenylalanine (Phe(pNH2)); N-methyl valine 
(MeVal); homocysteine (hCys), homophenylalanine (hPhe) and homoserine (hSer); 
hydroxyproline (Hyp), homoproline (hPro), N-methylated amino acids and peptoids 
(N-substituted glycines). 

The classifications of the genetically encoded and common non-encoded amino acids 
according to the categories defined above are summarized in Table 1, below. It is to be 
understood that Table 1 is for illustrative purposes only and does not purport to be an 
exhaustive list of amino acid residues that can be used in the invention. Additional amino 
acids may be found in Fasman, 1989, Practical Handbook of Biochemistry and Molecular 
Biology, CRC Press, Inc., pp. 3-70, and the references cited therein. 



TABLE 1: CLASSIFICATIONS OF COMMONLY ENCOUNTERED AMINO ACIDS 

Classification    Genetically Encoded     Non-Genetically Encoded 

Hydrophobic 
  Aromatic        H, F, Y, W              Phg, Nal, Thi, Tic, Phe(4-Cl), Phe(2-F), 
                                          Phe(3-F), Phe(4-F), hPhe 
  Apolar          L, V, I, M, G, A, P     t-BuA, t-BuG, MeIle, Nle, MeVal, Cha, 
                                          MeGly, Aib 
  Aliphatic       A, V, L, I              β-Ala, Dpr, Aib, Aha, MeGly, t-BuA, 
                                          t-BuG, MeIle, Cha, Nle, MeVal 
Hydrophilic 
  Acidic          D, E 
  Basic           H, K, R                 Dpr, Orn, hArg, Phe(p-NH2), Dbu, Dab 
  Polar           C, Q, N, S, T           Cit, AcLys, MSO, β-Ala, hSer 



As utilized herein, the term "pilus" or "pili" relates to fibrillar heteropolymeric 
structures embedded in the cell envelope of many tissue-adhering pathogenic bacteria, 
notably pathogenic Gram-negative bacteria. In the present specification, the terms pilus and 
pili will be used interchangeably. A pilus is composed of a number of "pilus subunits" which 
constitute distinct functional parts of the intact pilus. 

The term "chaperone" relates to a molecule which in living cells has the responsibility 
of binding to polypeptides in order to mature the polypeptides in a number of ways. Many 
molecular chaperones are involved in the process of folding polypeptides into their native 
conformations whereas other molecular chaperones are involved in the export out of or 
import into the cell of polypeptides. Specialized molecular chaperones are "periplasmic 
chaperones" which are bacterial molecular chaperones exerting their main actions in the 
"periplasmic space." Specialized periplasmic chaperones also have an immunoglobulin-like 
three dimensional structure. The periplasmic space constitutes the space in between the inner 
and outer bacterial membrane. Periplasmic chaperones are involved in the process of correct 
assembly of intact pili structures. When used herein, the use of the term "chaperone" 
designates a molecular, periplasmic chaperone unless otherwise indicated. 

The phrase "preventing or inhibiting binding between pilus subunits and a periplasmic 
chaperone" indicates that the normal interaction between a chaperone and its natural ligand, 
i.e., the pilus subunit, is being affected either by being inhibited, expressed in another 
manner, or reduced to such an extent that the binding of the pilus subunit to the chaperone is 
measurably lower than is the case when the chaperone is interacting with the pilus subunit at 
conditions which are substantially identical (with regard to pH, concentration of ions, and 
other molecules) to the native conditions in the periplasmic space. Measurement of the 
degree of binding can be determined in vitro by methods known to the person skilled in the 
art (microcalorimetry, radioimmunoassays, enzyme based immunoassays, etc.). 

The phrase "preventing or inhibiting binding between pilus subunits" generally 
indicates that the normal interaction between pilus subunits is being affected either by being 
inhibited, expressed in another manner, or reduced to such an extent that the binding of a 
pilus subunit to another pilus subunit is measurably lower than is the case when the pilus 
subunits are interacting at conditions which are substantially identical (with regard to pH, 
concentration of ions, and other molecules) to the native conditions during pilus assembly. 
This phrase can apply to the dissociation of pre-formed pilus subunit-subunit interactions 
during pilus assembly. Measurement of the degree of binding can be determined in vitro by 
methods known to the person skilled in the art (microcalorimetry, radioimmunoassays, 
enzyme based immunoassays, etc.). 

The compounds and compositions of the present invention which prevent or inhibit 
binding between pilus subunits or between a pilus chaperone and a subunit are said to exhibit 
"antibacterial activity." 

By the term "subject in need thereof" is meant, in the present context, a subject, which 
can be any plant or animal, including a human being, that is infected with, or is likely to be 
infected with, tissue-adhering pilus-forming bacteria which are believed to be pathogenic. 

By the term "an effective amount" is meant an amount of the substance in question 
which will in a majority of patients have either the effect that the disease caused by the 
pathogenic bacteria is cured or ameliorated or, if the substance has been given 
prophylactically, the effect that the disease is prevented from manifesting itself. The term "an 
effective amount" also implies that the substance is given in an amount which only causes 
mild or no adverse effects in the subject to whom it has been administered, or that the adverse 
effects may be tolerated from a medical and pharmaceutical point of view in the light of the 
severity of the disease for which the substance has been given. 

As used herein, "treatment" includes both prophylaxis and therapy. Thus, in treating a 
subject, the compounds of the invention may be administered to a subject already harboring a 
bacterial infection or in order to prevent such infection from occurring. 

By the term "a mimic of a pilus subunit" is meant a compound which has been 
established to bind to a chaperone or to another pilus subunit in a manner which is 
comparable to the way the pilus subunit binds to the chaperone or to the way that the pilus 
subunits bind to each other, respectively. 

The terms "an analogue of a G1 beta-strand of a periplasmic chaperone" or "a mimic 
of a G1 beta-strand of a periplasmic chaperone" denote any substance which mimics or has 
the ability to bind to at least one pilus subunit in a manner which corresponds to the binding 
of a chaperone to a pilus subunit in the periplasmic space. Such an analogue or mimic of the 
chaperone can be a modified form of the intact chaperone (e.g., one of the two domains of 
PapD) or it can be a modified form of the chaperone which may, e.g., be coupled to a probe, 
marker or another moiety. Another such analogue or mimic can be obtained by modifying or 
mutating the G1 beta-strand of the periplasmic chaperone so that it differs from the wild-type 
sequence by the substitution of at least one amino acid residue of the wild-type sequence with 
a different amino acid residue and/or by the addition and/or deletion of one or more amino 
acid residues to or from the wild-type sequence. The additions and/or deletions can be from 
an internal region of the wild-type sequence and/or at either or both of the N- or C-termini. In 
the present context, the pilus subunit, mimic or analogue thereof exhibits at least one binding 
characteristic relevant for the assembly of pili. 

In the present context the terms "an analogue of a pilus subunit" and "a mimic of a 
pilus subunit" should be understood, in a broad sense, to mean any substance which mimics 
(with respect to binding characteristics) an effective part of a pilus subunit (e.g., the 
amino-terminal portion of the pilus subunit). Thus, the analogue or mimic may simply be any 
other compound regarded as capable of mimicking the binding between pilus subunits in vivo 
or in vitro. In the present context, the pilus subunit, mimic or analogue thereof exhibits at 
least one binding characteristic relevant for the assembly of pili. 

In the present context the terms "a mannose analogue" or "a mannose mimic" should 
be understood, in a broad sense, to mean any substance which mimics (with respect to binding 
characteristics) the mannose sugar which binds to an effective part of the FimH adhesin (e.g., 
the NH2-terminal mannose-binding domain). Thus, the analogue or mimic may simply be any 
other compound regarded as capable of mimicking the binding of a mannose-oligosaccharide 
to FimH adhesin in vivo or in vitro. In the present context, the mannose analogue or mannose 
mimic exhibits at least one binding characteristic relevant for the adhesion of pili. 

The term "donor strand complementation" refers to the mechanism by which a 
chaperone donates its G1 beta-strand to complete the fold of a pilus subunit. 

The term "donor strand exchange" refers to the mechanism by which the 
amino-terminal extension of a pilus subunit displaces the G1 beta-strand of a pilus chaperone 
and subsequently occupies the subunit groove previously occupied by the G1 beta-strand. 

The term "crystallized PapD-PapK chaperone-subunit co-complex" refers to a 
polypeptide co-complex having an amino acid sequence as set out in SEQ ID NO: 1 and SEQ 
ID NO: 12 and which is in crystalline form. 

The term "crystal" refers to a composition comprising a polypeptide in crystalline 
form. The term "crystal" includes native crystals, heavy-atom derivative crystals and 
co-crystals, as defined herein. 

The term "native crystal" refers to a crystal wherein the polypeptide is substantially 
pure. As used herein, native crystals do not include crystals of polypeptides comprising 
amino acids that are modified with heavy atoms, such as crystals of selenomethionine 
mutants, selenocysteine mutants, etc. 

The term "heavy-atom derivative crystal" refers to a crystal wherein the polypeptide is 
in association with one or more heavy-metal atoms. As used herein, heavy-atom derivative 
crystals include native crystals into which a heavy metal atom is soaked, as well as crystals of 
selenomethionine mutants and selenocysteine mutants. 

The term "co-complex" refers to a polypeptide in association with one or more 
additional polypeptides or other molecules. For example, the PapD-PapK and FimC-FimH 
assemblies are co-complexes. 

The term "co-crystal" refers to a composition comprising a co-complex, as defined 
above, in crystalline form. Co-crystals include native co-crystals and heavy-atom derivative 
co-crystals. 

The term "unit cell" refers to the smallest and simplest volume element (i.e., 
parallelepiped-shaped block) of a crystal that is completely representative of the unit or pattern 
of the crystal. The dimensions of the unit cell are defined by six numbers: dimensions a, b 
and c and angles α, β and γ (Blundell et al., 1976, Protein Crystallography, Academic Press). 
A crystal is an efficiently packed array of many unit cells. 
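
As an illustration of how the six unit cell parameters describe a parallelepiped, the following minimal sketch computes the cell volume from a, b, c and the angles α, β and γ using the standard relation V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ). The function name `unit_cell_volume` and the numerical values are hypothetical and are not taken from the crystals described herein.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a unit cell from edge lengths a, b, c and angles in degrees."""
    cos_a = math.cos(math.radians(alpha_deg))
    cos_b = math.cos(math.radians(beta_deg))
    cos_g = math.cos(math.radians(gamma_deg))
    term = 1.0 - cos_a**2 - cos_b**2 - cos_g**2 + 2.0 * cos_a * cos_b * cos_g
    if term <= 0.0:
        raise ValueError("angles do not describe a valid unit cell")
    return a * b * c * math.sqrt(term)


# Hypothetical orthogonal cell: all angles 90 degrees, so the volume is a*b*c.
print(unit_cell_volume(50.0, 60.0, 70.0, 90.0, 90.0, 90.0))
```

The volume is in cubic units of whatever length unit is used for a, b and c (for example, cubic angstroms).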

The phrase "having substantially the same three-dimensional structure" refers to a 
polypeptide that is characterized by a set of atomic structure coordinates that have a root 
mean square deviation (r.m.s.d.) of less than or equal to about 2 Å when superimposed onto 
the atomic structure coordinates of Tables 4 or 5 when at least about 50% to 100% of the Cα 
atoms of the coordinates are included in the superposition. 
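
The r.m.s.d. criterion above can be sketched as follows. This minimal example assumes that the two sets of Cα coordinates have already been superimposed and paired residue by residue; it does not perform the superposition itself. The names `rmsd` and `substantially_same`, the fixed 2.0 Å cutoff standing in for "about 2 Å", and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def substantially_same(coords_a, coords_b, cutoff=2.0):
    """Assumed test: r.m.s.d. of the superimposed C-alpha atoms is at most the cutoff (in angstroms)."""
    return rmsd(coords_a, coords_b) <= cutoff


# Hypothetical, already superimposed C-alpha positions (angstroms).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
candidate = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(rmsd(reference, candidate))
print(substantially_same(reference, candidate))
```

In practice the superposition step and the choice of which Cα atoms to include determine the value obtained, so the sketch covers only the final comparison.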

Detailed Description of the Invention 
In accordance with the present invention, applicants have designed and fabricated 
compounds that mimic components of chaperones such as PapD and pilus subunits such as 
PapK, and which thereby function to interfere with pilus assembly. Specifically, applicants 
have devised compounds and methods which interfere with the binding of a chaperone or a 
pilus subunit to a pilus subunit, which will thus interfere with the formation of intact pili, 
thereby reducing the capacity of bacteria to adhere to host epithelium. Further, applicants 
have devised compounds which interfere with the adhesion of FimH adhesin to mannose 
oligosaccharides located on the host epithelium, thereby reducing the capacity of piliated 
bacteria to attach to and infect host tissues. Applicants have further demonstrated that 
prevention or inhibition of pilus assembly in Gram-negative pathogens can be accomplished 
in a number of ways. 

The co-crystal structure of PapD has been resolved and refined to a 2.0 angstrom 
resolution, revealing a molecule with two immunoglobulin-like domains oriented in an L 
shape to form a cleft at their interface. See A. Holmgren and C.-I. Branden, Nature, 342:248 
(1989). The chaperone cleft contains surface-exposed residues that are highly conserved. 
Each immunoglobulin-like domain has a beta-barrel structure formed by two antiparallel 
beta-pleated sheets with an overall topology similar to an immunoglobulin fold. Applicants 
have resolved the co-crystal structure of the PapD-PapK chaperone-subunit co-complex 
which reveals how PapD stabilizes pilus subunits in the periplasm. Further, a combination of 
genetic, biochemical, and crystallographic data has demonstrated that the G1 beta-strand of 
PapD forms a beta-zipper interaction with the highly conserved COOH-terminal motif of 
pilus subunits. See Hung et al., EMBO J. 15:3792 (1996); Kuehn et al. 
(1993); Soto et al., EMBO J. 17:6155 (1998). This COOH-terminal motif also comprises at 
least part of a primary surface for subunit-subunit assembly interactions, indicating that the 
direct capping of a primary assembly surface is part of the molecular basis by which 
periplasmic chaperones prevent the premature oligomerization of pilus subunits. In addition, 
it is believed that the beta-zipper interaction facilitates the folding of the subunit into a 
native-like conformation via a template-mediated mechanism. 

Applicants have solved the three dimensional co-crystal structure of a FimC-FimH 
chaperone-adhesin co-complex from uropathogenic E. coli. See Choudhury et al., Science 
285:1061 (1999). This molecular mechanism is supported by this structure. Specifically, 
applicants have demonstrated that in the FimC-FimH co-complex, the seventh (G1) strand 
from the NH2-terminal domain of the chaperone is used to complement the pilin domain 
between the second half of the A strand and the F strand of the domain. As such, the F strand 
of FimH forms a parallel beta-strand interaction with the G1 beta-strand of FimC and has its 
COOH-terminal carboxyl group anchored in the crevice of the chaperone cleft of FimC. 

Thus, applicants have elucidated the mechanism of binding between PapD and the 
pilus subunit PapK, thereby identifying an essential part of a defined binding site responsible 
for the binding between pilus subunits as well as binding between pilus subunits and their 
periplasmic chaperones. Furthermore, applicants have utilized the PapD-PapK co-crystal 
structure, the first of such a co-complex, and the FimC-FimH co-crystal structure to provide 
further insights into the processes of subunit folding, capping, and assembly in the 
chaperone/usher pathway of pilus biogenesis, and thereby devised compounds, compositions 
and methods for the prevention and inhibition of pilus formation. 

Furthermore, applicants have elucidated the mannose binding domain of the FimH 
adhesin which is responsible for mediating the binding of pili to mannose receptors on host 
cells. As demonstrated further in the examples, a pocket capable of accommodating a 
mono-mannose unit is located at the tip of the lectin domain of the FimH adhesin. Applicants 
have utilized the identification of this mannose-binding site to design compounds and 
compositions which would function to interfere with pilus attachment to epithelial tissues, 
thereby inhibiting or preventing the ability of the bacterium to infect host tissues. 



PapD-PapK Chaperone-Subunit Co-complex 

An important aspect of the PapD-PapK chaperone-subunit co-complex is the structure 
of the PapK subunit. PapK has an immunoglobulin-like fold; however, it lacks the canonical 
seventh beta-strand and in its place is a deep groove located on the surface of the PapK 
subunit. The base of the groove on the surface of the PapK subunit is formed by the 
hydrophobic core of the protein. From the resolved co-crystal structure of the PapD-PapK 
chaperone-subunit co-complex, it can be seen that the G1 beta-strand of the chaperone 
occupies this groove and prevents the exposure of the hydrophobic core of the subunit, which 
would lead to the destabilization and degradation of the subunits. 

Moreover, the PapD-PapK chaperone-subunit co-complex provides further insight
into the mechanism by which pilus subunits assemble to form a mature, intact pilus. The
eight amino acids located on the amino-terminus of PapK are disordered and presumably
project away from the co-complex. These residues contain a pattern of alternating
hydrophobic residues typical of a beta-strand which is conserved in pilus subunits. Thus,
while not being bound to a particular theory, it is believed that in the mature pilus, the
amino-terminal residues of one subunit occupy the groove of the adjacent subunit.

In the PapD-PapK co-complex structure, strand F of PapK forms one side of the
groove into which the G1 beta-strand of the chaperone is inserted and is likely to assume the
same structural role in pilins. Structural, biochemical and genetic data have demonstrated
that strand F (and hence the groove) in pilins is involved in both chaperone-subunit and
subunit-subunit interactions. By donating a secondary structural element to the fold of the
pilin, the chaperone not only contributes to the stability of the pilin but also prevents other
pilins in the periplasm from binding to the groove of the chaperone-bound subunit.

The amino-terminal region of pilins, corresponding to the disordered amino-terminus
of PapK, has also been shown to form an assembly surface on the pilin. The eight
NH2-terminal residues are disordered in the PapD-PapK co-complex and protrude away from the
main body of the co-crystal structure where they would be free to interact with the groove of
the preceding subunit located at the usher. The amino-terminus of an incoming subunit
inserts into the groove of the preceding subunit, displacing the G1 beta-strand of the
chaperone in a mechanism that is facilitated by the usher. Applicants refer to this mechanism
as "donor strand exchange". Donor strand exchange implies that in the pilus, the
NH2-terminal strand of one subunit would complete the immunoglobulin-like fold and protect the
hydrophobic core of the preceding subunit, much as the chaperone does in the periplasm.

A donor strand exchange model for pilus assembly employing a PapK structure was
utilized to model a PapA pilus rod. Pilus rods are well-ordered helical structures with a
diameter of 68 Å, a pitch of 24.9 Å, and 3.28 subunits per turn. The disordered NH2-terminus
of PapK was modeled as a beta-strand protruding from the Ig fold at an angle consistent with
the ordered portion of the NH2-terminus in the structure, and inserted into the groove of the
preceding subunit. A pilus rod with the appropriate general features and without steric
clashes could be built by applying identical translational and rotational operations to
successive subunits. The model pilus has a 72 Å diameter, a pitch of approximately 22 Å,
and approximately 3.3 subunits per turn, similar to the actual dimensions of the pilus rod
(Fig. 7). However, the model has an unexpected feature: the NH2-terminal strand of one
subunit runs antiparallel (not parallel as does the G1 beta-strand of PapD) to strand F of the
preceding subunit. A parallel beta-strand interaction with strand F of the preceding subunit
would produce a rod with a star-shaped cross-section (Figs. 7A and 7B), inconsistent with the
electron microscopy data. Thus, while donor strand complementation with the chaperone
results in an atypical immunoglobulin fold, donor strand exchange between subunits produces
a canonical variable-region immunoglobulin fold in the mature pilus.
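
The comparison between the observed rod and the model rod can be made concrete by
converting each pitch and subunits-per-turn pair into a rise and a rotation per subunit. The
following is a minimal sketch only; the function name helix_step is hypothetical and is not
part of the disclosure, and the inputs are the figures quoted in the paragraph above.

```python
from typing import Tuple


def helix_step(pitch_angstrom: float, subunits_per_turn: float) -> Tuple[float, float]:
    """Return (rise per subunit in angstroms, rotation per subunit in degrees)."""
    rise = pitch_angstrom / subunits_per_turn
    rotation = 360.0 / subunits_per_turn
    return rise, rotation


# Observed PapA pilus rod: pitch 24.9 angstroms, 3.28 subunits per turn.
observed = helix_step(24.9, 3.28)

# Model rod: pitch of approximately 22 angstroms, approximately 3.3 subunits per turn.
model = helix_step(22.0, 3.3)

print("observed rise %.2f A, rotation %.2f deg" % observed)
print("model    rise %.2f A, rotation %.2f deg" % model)
```

Under these figures the two rods differ by less than one degree in rotation per subunit and
by roughly one angstrom in rise per subunit.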

FimC-FimH chaperone-adhesin co-complex

Further evidence illustrating donor strand complementation is provided by the
resolution of the co-crystal structure of the FimC-FimH chaperone-adhesin co-complex from
uropathogenic E. coli. See Choudhury, et al., Science 285: 1061 (1999). The FimC-FimH
chaperone-adhesin co-complex structure also reveals a donor strand complementation
mechanism that explains the basis of both chaperone function and pilus biogenesis.

The FimH adhesin subunit is folded into two domains of the all-beta class, a
NH2-terminal mannose-binding domain and a COOH-terminal pilin domain. A short extended
linker (residues 157H - 159H) connects the two domains. The NH2-terminal mannose-binding
domain comprises residues 1H - 156H, and the COOH-terminal pilin domain which
is used to anchor the adhesin to the pilus comprises residues 160H - 279H (Figure 8A). The
pilin domain of FimH binds in the cleft of the chaperone (Figure 9B) with limited contact
between FimH and the COOH-terminal domain of FimC.

The lectin domain of FimH is an eleven-stranded elongated beta-barrel with a jelly
roll-like topology (Figure 8B). The fold starts with a short beta hairpin that is not part of the
jelly roll. The final (eleventh) strand of the domain is inserted between the third and tenth
strands and thus breaks the jelly-roll topology. A pocket capable of accommodating a
mono-mannose unit is located at the tip of the domain, distal from the connection to the pilin
domain (Figure 9B). The bottom of the pocket is lined with asparagine, glutamine and
aspartic acid residues in three loop regions, which are typical carbohydrate-binding side chains
(Figure 10A). These residues form hydrogen bonds with C-HEGA as described in Example 3
herein.

The pilin domain of FimH has the same immunoglobulin-like topology as the
amino-terminal domain of FimC, except that the seventh strand of the fold is missing (Figure 8B).
Two anti-parallel beta-sheets (strands A'BED' and D''CF) pack against each other to form a
beta-barrel that is similar to, but distinct from, immunoglobulin barrels. As in the
chaperones, strand switching occurs at the edges of the sheets. In the chaperones, the A1
strand of the amino-terminal domain switches between the two sheets of the barrel. The first
strand of the pilin domain exhibits a similar switch, but due to the lack of a seventh strand,
the second half of the A strand is not involved in main chain hydrogen bonding within the
domain. The D strand of the chaperones as well as of the FimH pilin domain also switches,
but in the pilin domain the switch is an eight-residue loop instead of the cis-proline bulge
found in the chaperones. The C-D loop and the D'-D'' connection pack against each other
and close the top of the barrel. The other side of the barrel, defined by the A and F edge
strands, is open. Due to the absence of a seventh strand a deep scar is created on the surface
of the domain. Residues that would be part of the hydrophobic core of an intact,
seven-stranded PapD-like domain instead line a deep hydrophobic crevice on the surface of the pilin
domain (Figure 10B).

As mentioned herein, the donor strand complementation mechanism refers to the
chaperone donating its G1 beta-strand to complete the fold of the pilin domain. The G1
beta-strand of periplasmic chaperones such as FimC and PapD contains a conserved motif of
solvent-exposed hydrophobic residues at positions 103, 105 and 107. In the chaperone-subunit
co-complex, the G1 beta-strand containing these alternating hydrophobic residues is
used to complete the unfinished hydrophobic core of pilus subunits such as FimH and PapK.
Thus, in the FimC-FimH co-complex, these hydrophobic residues are used to complete the
unfinished hydrophobic core of FimH which results from the missing seventh strand.
Specifically, the seventh (G1) strand from the NH2-terminal domain of the FimC chaperone
complements the FimH pilin domain by being inserted between the second half of the A
strand and the F strand of the domain (Figure 10C). Leu103C and Leu105C are deeply buried in
the crevice in the FimH pilin domain. Leu103C of FimC contacts Ile, Val, Leu and Ile
residues of FimH. Leu105C of FimC is in contact with Ile, Leu, Ile and Val residues of FimH.
Ile107C is closer to the FimH pilin domain surface but makes van der Waals
contacts with Val and Phe residues of FimH. The final strand (F) of FimH forms a parallel
beta-strand interaction with the G1 beta-strand of FimC and has its COOH-terminal carboxylate
group anchored in the crevice of the chaperone cleft through hydrogen bonding with the
conserved Arg and Lys residues in FimC (Figure 9A). This interaction is critical for
chaperone function.
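
The alternating pattern at positions 103, 105 and 107 can be illustrated with the FimC
peptide NTLQLAI listed in Table 3. The sketch below is illustrative only and rests on two
assumptions that are not stated in the text: that this 7-residue peptide spans FimC residues
101 to 107, and that the hydrophobic class is the set of one-letter codes shown. The helper
name residues_at is hypothetical.

```python
# Assumed hydrophobic class (one-letter codes); the text does not define this set.
HYDROPHOBIC = set("AVLIMFWY")

FIMC_G1_PEPTIDE = "NTLQLAI"   # FimC entry of Table 3
FIRST_RESIDUE = 101           # assumed numbering of the first peptide residue


def residues_at(peptide: str, first_residue: int, positions) -> dict:
    """Map each requested residue number to the one-letter code found there."""
    return {pos: peptide[pos - first_residue] for pos in positions}


motif = residues_at(FIMC_G1_PEPTIDE, FIRST_RESIDUE, (103, 105, 107))
all_hydrophobic = all(code in HYDROPHOBIC for code in motif.values())

print(motif)            # residue number -> one-letter code
print(all_hydrophobic)  # whether every motif position is hydrophobic
```

With the assumed numbering, positions 103, 105 and 107 fall on L, L and I, which matches the
Leu, Leu and Ile residues named in the paragraph above.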

Furthermore, the two conserved motifs of FimH (the COOH-terminal F strand and an
amino-terminal motif) participate in subunit-subunit interactions necessary for pilus
assembly. See G.E. Soto et al., EMBO J., 17: 6155 (1998). An alignment of the pilin
sequences demonstrates that the amino-terminal motif of FimC was part of a 10-20 residue
NH2-terminal extension that was missing in the FimH pilin domain (Figure 8A) and
disordered in the PapD-PapK co-complex as discussed above. This region contains a highly
conserved pattern of alternating hydrophobic residues (highlighted in Figure 8A) similar to
the donor G1 beta-strand of the chaperone. Applicants believe that the amino-terminal
extension of the FimH subunit is structurally analogous to the donor G1 beta-strand motif of
the chaperone and thus, would fit into the pilin groove occupied by the donor G1 beta-strand
of the chaperone.

However, the type 1 pilus is a right-handed helix with about 3 subunits per turn, a
diameter of approximately 70 Å, a central pore of about 20-25 Å, and a rise per subunit of
about 8 Å. Thus, in order to obtain this structure, the insertion of the NH2-terminal extension
must be antiparallel to strand F in contrast to the parallel insertion observed for the G1
beta-strand of the chaperone. Insertion in a parallel orientation would lead to rosette-like
structures. One edge of the pilin groove is lined by the COOH-terminal F strand and forms a
critical part of the subunit tail. Thus, without being bound to any theory, Applicants believe
that the amino-terminal extension represents the head of a subunit and during pilus
biogenesis, the amino-terminal extension would displace the donor G1 beta-strand of the
chaperone to fit into the tail groove of a neighboring subunit to complete the pilin fold of its
neighbor in a donor strand complementation mechanism.

Applicants constructed a model for the type 1 pilus using the FimH pilin domain as a
model for FimA (Figure 11). Each subunit was aligned to have its cleft facing towards the
center of the pilus so that the height from the top to the bottom of the domain along the helix
axis was approximately 25 Å. Applying a rotation of 115 degrees and a rise per subunit of 8
Å, a hollow helical cylinder is created. The outer diameter of this cylinder as measured across
Cα atoms is 70 Å, and the inner diameter is 25 Å. FimA subunits from different strains of E.
coli exhibit considerable allelic variation. The vast majority of the variable positions are on
the outside surface of the pilus model described above (Figure 11) which would account for
the antigenic variability of type 1 pili.
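
The construction described above, in which the same rotation and rise are applied to each
successive subunit, can be sketched as follows. This is a simplified illustration and not
Applicants' modeling procedure: each subunit is reduced to a single reference point, and the
radius at which that point is placed (RADIUS) is an assumed value chosen midway between the
inner and outer radii implied by the stated diameters. The function name helix_points is
hypothetical.

```python
import math
from typing import List, Tuple

ROTATION_DEG = 115.0  # rotation per subunit, from the text
RISE = 8.0            # rise per subunit in angstroms, from the text
RADIUS = 23.75        # assumed reference radius in angstroms (midway between 12.5 and 35)


def helix_points(n_subunits: int,
                 rotation_deg: float = ROTATION_DEG,
                 rise: float = RISE,
                 radius: float = RADIUS) -> List[Tuple[float, float, float]]:
    """Place one reference point per subunit on a right-handed helix about the z axis."""
    points = []
    for i in range(n_subunits):
        theta = math.radians(i * rotation_deg)
        # Counter-clockwise rotation with increasing z gives a right-handed helix.
        points.append((radius * math.cos(theta), radius * math.sin(theta), i * rise))
    return points


subunits_per_turn = 360.0 / ROTATION_DEG   # about 3.13
pitch = RISE * subunits_per_turn           # about 25.0 angstroms
points = helix_points(10)

print("subunits per turn: %.2f, pitch: %.2f A" % (subunits_per_turn, pitch))
```

A rotation of 115 degrees corresponds to roughly 3.13 subunits per turn, in line with the
"about 3 subunits per turn" stated for the type 1 pilus.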

The head-to-tail interaction between subunits in a pilus is reminiscent of
oligomerization through three dimensional domain swapping in the sense that a part of the
molecule is used to complement another. However, in this case, complementation occurs not
only between identical protein chains (FimA in the pilus rod) but also between homologous
but distinct chains, e.g., FimG, FimF and FimH in the pilus tip. Furthermore, because
individual pilin protomers do not exist as stable monomers, there is no exchange of
structural units between a monomeric and an oligomeric state. Instead, a different protein, the
periplasmic chaperone, is needed to keep the monomeric subunits in solution by donating a
unique part of its structure (the G1 beta-strand) to the different subunit grooves.

Based on the structure of the FimC-FimH co-complex and without being limited to
any theory, it is believed that pilins are missing necessary steric information needed to fold
into a native three dimensional structure. The information that is missing consists of the
seventh edge strand of an immunoglobulin fold. This strand, which is necessary for folding,
is donated to the hydrophobic core of the pilin by the periplasmic chaperone in a donor strand
complementation mechanism.

Applicants further utilized the co-crystal structure of the FimC-FimH chaperone-adhesin
co-complex to identify the amino-terminal mannose-binding domain of FimH, an
essential component required for pilus adhesion to host tissues. As discussed above, the
bottom of the mannose-binding pocket of this domain is lined with asparagine, glutamine and
aspartic acid residues, and those skilled in the art would be able to use molecular modeling
techniques and other existing protocols to design and synthesize antibacterial compounds
directed to this pocket. Such compounds
would compete with mannose for binding to the FimH adhesin thereby preventing or
inhibiting pilus adhesion to host epithelium.

Thus, applicants utilized the discovery of this molecular mechanism of protein
binding to identify an essential part of a defined binding site responsible for pilus assembly
and adhesion. Further, applicants have utilized this structure to design and fabricate methods
and compounds to compete with the chaperone for binding to the exposed binding site of the
pilus subunit thereby inhibiting pilus assembly and reducing the pathogenicity of piliated
Gram-negative bacteria. Such a compound is useful in treating bacterial diseases or in
preventing costly biofilm formation in medical, industrial and various other settings.

Peptide compounds 

Thus, the present invention is directed to compounds which mimic the capability of a
periplasmic chaperone or of a pilus subunit to bind to the groove of a pilus subunit, thereby
preventing or inhibiting pilus biogenesis by interfering with the normal function of these
biological components. Specifically, applicants have shown that prevention or inhibition of
the binding between pilus subunits and between pilus subunits and periplasmic chaperones
can be accomplished in a number of ways.

In a preferred embodiment of the invention, the compounds are peptides or peptide
analogs that are capable of disrupting the assembly of pilus subunits and/or binding the cleft
of a pilus subunit that is bound by the G1 beta-strand of another pilus subunit in an assembled
pilus structure and comprise a core sequence of residues preferably derived from a conserved
N-terminal region of a pilus subunit. As will be apparent from alignments of the conserved
N-terminal regions of the various pilus subunits, such peptides and peptide analogs will
typically comprise at least two alternating hydrophobic amino acids. The core sequence of
such peptides and peptide analogs may be derived from the amino-terminal sequence of any
of a number of pilus subunits, including but not limited to, PapA, PrsA, FimA, SfaA, FocA,
HifA, HafA, Fim2, Fim3, MrpA, PmfA, LpfA, PefA, AtfA, PapK, PrsK, PapH, PrsH, PapE,
PrsE, MrpB, SfaG, SfaS, FocG, FocF, PapF, PrsF, MrpF, MrpE, F17A, FanC, FaeA, MrkA
and RalC. Typically, the core sequence is composed of about 3 to about 12 residues,
preferably 5 to 9, most preferably 7 residues. The core sequence may correspond identically
to the sequence of a pilus subunit, or it may include one or more substitutions, preferably
conservative substitutions, and/or insertions and/or deletions.

Moreover, the core sequence may be flanked at either or both of its N- and/or
C-termini by residues of random sequence (i.e., sequences that do not necessarily correspond to
the pilus subunit from which the core sequence is derived). When included, such flanking
residues should not significantly alter the ability of the core sequence to disrupt subunit
assembly. Thus, typically the compounds of the invention will include fewer than 5 flanking
residues at each terminus, preferably fewer than 3 flanking residues, and most preferably no
flanking residues.

Further, the peptides and/or peptide analogs may comprise hybrid sequences. For
example, the peptide or peptide analog may include a core sequence derived from PapA
flanked at one or both termini with sequences derived from FimA. Alternatively, the peptide
or peptide analog may include a core sequence of, for example, 10 residues, some of which
are, for example, derived from PapA and the rest of which are, for example, derived from
FimA.

In one illustrative embodiment, the compounds are 10 to 20 residue peptides and/or
peptide analogs comprising formula (I):

(I)   X1-X2-X3-X4-X5-X6-X7-X8-X9-X10

or a pharmaceutically-acceptable salt thereof, wherein:

X1 is any amino acid residue, preferably other than a basic residue;

X2 is any amino acid residue, preferably other than an aliphatic residue;

X3 is a hydrophobic residue, preferably an aliphatic residue or a hydroxyl-substituted
aliphatic residue;

X4 is any amino acid residue, preferably other than an acidic residue;

X5 is a hydrophobic residue or Gly;

X6 is a hydrophobic or a hydrophilic residue;

X7 is a hydrophobic residue, preferably Gly, an amide-substituted polar residue
or an aliphatic residue, and most preferably Gly;

X8 is any amino acid residue, preferably other than an aliphatic residue;

X9 is an aliphatic residue; and

X10 is any amino acid residue, preferably a hydrophobic residue, more
preferably an aliphatic residue or a polar residue.

In the compounds comprising formula (I), the symbol "-" between residues Xn
generally designates a backbone constitutive linking function. Thus, when the compounds
are peptides, the symbol "-" represents a peptide or amide linkage (-C(O)NH-). It is to be
understood, however, that formula (I) includes peptide analogs in which one or more amide
linkages is optionally replaced with a linkage other than amide linkage, preferably a
substituted amide or an isostere of amide linkage. Thus, while the various Xn residues within
formula (I) may conveniently be described in terms of "amino acids" or "residues," those
having skill in the art will recognize that in embodiments having non-amide linkages, the
term "amino acid" or "residue" as used herein refers to other bifunctional moieties bearing
side-chain groups similar in structure to the side chains of the amino acids.

Substituted amide linkages generally include, but are not limited to, groups of the
formula -C(O)N(R)-, where R is (C1-C6) alkyl, substituted (C1-C6) alkyl, (C1-C6) alkenyl,
substituted (C1-C6) alkenyl, (C1-C6) alkynyl, substituted (C1-C6) alkynyl, (C5-C20) aryl,
substituted (C5-C20) aryl, (C6-C26) arylalkyl, substituted (C6-C26) arylalkyl, 5-20 membered
heteroaryl, substituted 5-20 membered heteroaryl, 6-26 membered heteroarylalkyl and
substituted 6-26 membered heteroarylalkyl.

Isosteres of amide linkages generally include, but are not limited to, -CH2NH-,
-CH2S-, -CH2CH2-, -CH=CH- (cis and trans), -C(O)CH2-, -CH(OH)CH2- and -CH2SO-.
Compounds having such non-amide linkages and methods for preparing such compounds are
well-known in the art (see, e.g., Spatola, March 1983, Vega Data Vol. 1, Issue 3; Spatola,
1983, "Peptide Backbone Modifications" In: Chemistry and Biochemistry of Amino Acids
Peptides and Proteins, Weinstein, ed., Marcel Dekker, New York, p. 267 (general review);
Morley, 1980, Trends Pharm. Sci. 1:463-468; Hudson et al., 1979, Int. J. Prot. Res.
14:177-185 (-CH2NH-, -CH2CH2-); Spatola et al., 1986, Life Sci. 38:1243-1249 (-CH2-S); Hann,
1982, J. Chem. Soc. Perkin Trans. I. 1:307-314 (-CH=CH-, cis and trans); Almquist et al.,
1980, J. Med. Chem. 23:1392-1398 (-COCH2-); Jennings-White et al., Tetrahedron Lett.
23:2533 (-COCH2-); European Patent Application EP 45665 (1982) CA 97:39405
(-CH(OH)CH2-); Holladay et al., 1983, Tetrahedron Lett. 24:4401-4404 (-C(OH)CH2-); and
Hruby, 1982, Life Sci. 31:189-199 (-CH2-S-)).

Additionally, one or more amide linkages can be replaced with peptidomimetic or
amide mimetic moieties which do not significantly interfere with the structure or activity of
the peptides. Suitable amide mimetic moieties are described, for example, in Olson et al.,
1993, J. Med. Chem. 36:3039-3049.

Compounds comprising formula (I) that are peptide analogs may provide significant
therapeutic advantages, as their non-peptide interlinkages may confer the compound with
enhanced stability towards proteases and/or peptidases, thereby conferring the compounds
with increased in vivo stability compared to a corresponding peptide.

The various residues X1 through X10 may be selected from amongst the genetically
encoded amino acids, as well as from genetically non-encoded amino acids. Moreover, the
residues may be in either the D- or L-configuration, as long as the compound retains activity.
Compounds including D-amino acids may have enhanced in vivo stability. Preferably, all of
residues X1 through X10 are in the L-configuration.

The peptides and peptide analogs of formula (I) may optionally include, in addition to
the sequence defined by residues X1 through X10, a 1 to 5 residue peptide or peptide analog at
either or both termini. Peptide analogs typically contain at least one modified interlinkage,
such as a substituted amide or an isostere of an amide, as described above. Such additional
peptides or peptide analogs may have an amino acid sequence derived from a pilus subunit or,
alternatively, their sequences may be completely random. Compounds including such
random sequences may be tested for biological activity in the various assays and methods
described in a later section.

The residues which comprise such additional peptides or peptide analogs may be
genetically encoded or non-encoded, and may be in either the D- or L-configuration. In one
embodiment, when the sequence defined by formula (I) is a peptide, one or both termini are
"capped" with 1 to 5 residue peptides composed wholly of D-amino acids that serve to protect
the core sequence from degradation in vivo by proteases and/or peptidases.

Also included within the scope of the present invention are "blocked" forms of the
peptides and peptide analogs including formula (I), i.e., 10 to 20 residue peptides and/or peptide
analogs in which the N- and/or C-terminus is blocked with a moiety capable of reacting with
the N-terminal -NH2 or C-terminal -C(O)OH. Such blocked compounds are typically
N-terminal acylated and/or C-terminal amidated or esterified. Typical N-terminal blocking
groups include R'C(O)-, where R' is hydrogen, (C1-C6) alkyl, (C1-C6) alkenyl, (C1-C6)
alkynyl, (C5-C20) aryl, (C6-C26) arylalkyl, 5-20 membered heteroaryl or 6-26 membered
heteroarylalkyl. Preferred N-terminal blocking groups include acetyl, formyl and dansyl.
Typical C-terminal blocking groups include -C(O)NR'R' and -C(O)OR', where each R' is
independently as defined above. Preferred C-terminal blocking groups include those in
which each R' is independently (C1-C6) alkyl, preferably methyl, ethyl, propyl or isopropyl.

Preferred amongst the 10 to 20 residue peptides and/or peptide analogs comprising
formula (I) are those compounds having one or more of the following characteristics:

X3 is an aliphatic residue or T;

X5 is an aliphatic residue, F or G; and/or

X7 is G, H or A.

Particularly preferred are the 10-residue peptides described in Table 2, below.

Table 2: SUBUNIT N-TERMINAL-MOTIF-DERIVED PEPTIDES

AMINO ACID SEQUENCE               PILUS SUBUNIT

GKVTFNGTVV (SEQ ID NO: 2)         PapA, PrsA
GTVHFKGEVV (SEQ ID NO: 3)         FimA, SfaA, FocA
GKVTFFGKVV (SEQ ID NO: 4)         HifA, HafA
GTIVITGTIT (SEQ ID NO: 5)         Fim2
GTIVITGSIS (SEQ ID NO: 6)         Fim3
GTVKFVGSII (SEQ ID NO: 7)         MrpA
GEIQLKGEIV (SEQ ID NO: 8)         PmfA
GTKFTGEIV (SEQ ID NO: 9)          LpfA
NEVTFLGSVS (SEQ ID NO: 10)        PefA
GTINFEGSVV (SEQ ID NO: 11)        AtfA
SDVAFRGNLL (SEQ ID NO: 12)        PapK, PrsK
GRAAFHGEVV (SEQ ID NO: 13)        PapH
GRATFHGEVV (SEQ ID NO: 14)        PrsH
DNLTFRGKLI (SEQ ID NO: 15)        PapE
DNLTFKGKLI (SEQ ID NO: 16)        PrsE
GWLNLQGTIL (SEQ ID NO: 17)        MrpB
SVVNITGNVQ (SEQ ID NO: 18)        SfaG
IHTVTGNVL (SEQ ID NO: 19)         SfaS
TTITVTGRVL (SEQ ID NO: 20)        FocG
CMLAGSNFVT (SEQ ID NO: 21)        FocF
VQINIRGNVY (SEQ ID NO: 22)        PapF, PrsF
PNLKLFGTLL (SEQ ID NO: 23)        MrpF
VYINITGNVI (SEQ ID NO: 24)        MrpE
GKITFNGKVV (SEQ ID NO: 25)        F17A
GTINFNGKIT (SEQ ID NO: 26)        FanC
QKTEFSADVV (SEQ ID NO: 27)        FaeA
GQVNFFGKVT (SEQ ID NO: 28)        MrkA
QRTHTADVV (SEQ ID NO: 29)         RalC
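
The preferred characteristics recited for formula (I) lend themselves to a simple
positional check. The following sketch is illustrative only. The membership of the
"aliphatic" class is not defined in this section, so the set used here (A, V, L, I) is an
assumption, and the function name preferred_features_formula_i is hypothetical. The X7
alternatives are taken as printed (G, H or A).

```python
# Assumed membership of the "aliphatic" class; not defined in this section.
ALIPHATIC = set("AVLI")


def preferred_features_formula_i(peptide: str) -> dict:
    """Report which preferred characteristics of formula (I) a 10-residue peptide has."""
    if len(peptide) != 10:
        raise ValueError("formula (I) core sequences have 10 residues")
    x3, x5, x7 = peptide[2], peptide[4], peptide[6]
    return {
        "X3": x3 in ALIPHATIC or x3 == "T",   # aliphatic residue or T
        "X5": x5 in ALIPHATIC or x5 in "FG",  # aliphatic residue, F or G
        "X7": x7 in "GHA",                    # G, H or A
    }


papk = preferred_features_formula_i("SDVAFRGNLL")  # PapK, PrsK (SEQ ID NO: 12)
focf = preferred_features_formula_i("CMLAGSNFVT")  # FocF (SEQ ID NO: 21)

print(papk)
print(focf)
```

Because the text asks for "one or more" of the characteristics, the function reports each
characteristic separately rather than returning a single pass or fail value.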



In a preferred embodiment of the invention, the compounds are peptides or peptide
analogs that mimic the binding activity of the G1 beta-strand of a chaperone and that exhibit
antibacterial activity against a Gram-negative bacterium. The core sequence of such peptides
and peptide analogs may be derived from the G1 beta-strand of any of a number of
chaperones, including but not limited to, PapD, MrpD, FanE, SfaE, FaeE, MrkB, HifB, F17D,
FimC, FimB, PefD, EcpD, ClpE, YehC, PmfD, FocC, LpfB, SefB, Caf1M, CS3-1, PsaB,
MyfB, AggD, CssC, NfaA and AfaB. Typically, the core sequence is composed of about 3 to
about 12 residues, preferably from 4 to 9 residues and most preferably 7 residues. The core
sequence may correspond identically to the G1 beta-strand sequence of a chaperone, or it may
include one or more substitutions, preferably conservative substitutions, and/or insertions
and/or deletions.

Moreover, the core sequence may be flanked at either or both of its N- and/or
C-termini by residues of random sequence (i.e., sequences that do not necessarily correspond to
the G1 beta-strand from which the core sequence is derived). When included, such flanking
residues should not significantly alter the ability of the core sequence to mimic the binding
activity of the G1 beta-strand of a chaperone. Thus, typically the compounds of the invention
will include fewer than 5 flanking residues at each terminus, preferably fewer than 3 flanking
residues and most preferably no flanking residues.

Further, the peptides and/or peptide analogs may comprise hybrid sequences. For
example, the peptide or peptide analog may include a core sequence derived from the G1
beta-strand of a PapD chaperone flanked at one or both termini with sequences derived from an
MrpD chaperone. Alternatively, the peptide or peptide analog may include a core sequence
of, for example, 7 residues, some of which are, for example, derived from a PapD chaperone
and the rest of which are derived from, for example, a FanE chaperone.

In one illustrative embodiment, the compounds are 7 to 17 residue peptides and/or
peptide analogs comprising formula (II):

(II)   X11-X12-X13-X14-X15-X16-X17

or a pharmaceutically-acceptable salt thereof, wherein:

X11 is any amino acid residue, preferably other than a basic residue;

X12 is any amino acid residue;

X13 is a hydrophobic residue, preferably an aliphatic residue or an apolar
residue, wherein the apolar residue is preferably M;

X14 is any amino acid residue, preferably other than an aromatic residue;

X15 is a hydrophobic residue, preferably an aliphatic residue;

X16 is any amino acid residue, preferably an aliphatic residue or a hydroxyl-substituted
aliphatic residue; and

X17 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue,
preferably an aliphatic residue, F, M or a hydroxyl-substituted aliphatic residue.

In the compounds comprising formula (II), the symbol "-" between residues Xn is as
previously defined for formula (I).

The various residues X11 through X17 may be selected from amongst the genetically
encoded amino acids, as well as from genetically non-encoded amino acids. Moreover, the
residues may be in either the D- or L-configuration, as long as the compound retains activity.
Compounds including D-amino acids may have enhanced in vivo stability. Preferably, all of
residues X11 through X17 are in the L-configuration.

The peptides and peptide analogs of formula (II) may optionally include, in addition
to the sequence defined by residues X11 through X17, a 1 to 5 residue peptide or peptide analog
at either or both termini. Peptide analogs typically contain at least one modified interlinkage,
such as a substituted amide or an isostere of an amide, as described above. Such additional
peptides or peptide analogs may have an amino acid sequence derived from the G1 beta-strand
of a chaperone or, alternatively, their sequences may be completely random. Compounds
including such random sequences may be tested for biological activity in the various assays
and methods described in a later section.

The residues which comprise such additional peptides or peptide analogs may be
genetically encoded or non-encoded, and may be in either the D- or L-configuration. In one
convenient embodiment, when the sequence defined by formula (II) is a peptide, one or both
termini are "capped" with 1 to 5 residue peptides composed wholly of D-amino acids that
serve to protect the core sequence from degradation in vivo by proteases and/or peptidases.

Also included within the scope of the present invention are "blocked" forms of the
peptides and peptide analogs including formula (II), as previously described in connection
with compounds comprising formula (I).

Preferred amongst the 7 to 17 residue peptides and/or peptide analogs comprising
formula (II) are those compounds having one or more of the following characteristics:

X13 is an aliphatic residue or M;

X15 is an aliphatic residue, F or M; and/or

X17 is an aliphatic residue, F, M or T.



Particularly preferred are the 7-residue peptides described in Table 3, below.

Table 3: CHAPERONE G1 BETA-STRAND-DERIVED PEPTIDES

AMINO ACID SEQUENCE            CHAPERONE

NVLQIAL (SEQ ID NO: 1)         PapD, MrpD
GSLSLAI (SEQ ID NO: 30)        FanE
NYLQFAI (SEQ ID NO: 31)        SfaE
SGIAVAL (SEQ ID NO: 32)        FaeE
NILQLAI (SEQ ID NO: 33)        MrkB
SFMQIAI (SEQ ID NO: 34)        HifB
NYLQFAV (SEQ ID NO: 35)        F17D
NTLQLAI (SEQ ID NO: 36)        FimC
GVLQLTI (SEQ ID NO: 37)        FimB
NVLAVAV (SEQ ID NO: 38)        PefD
SLLQLAF (SEQ ID NO: 39)        EcpD
SGIAVAV (SEQ ID NO: 40)        ClpE
NALKFAM (SEQ ID NO: 41)        YehC
NVLQMAM (SEQ ID NO: 42)        PmfD
NYLQFAI (SEQ ID NO: 43)        FocC
NVLQIAV (SEQ ID NO: 44)        LpfB
LNVNVVT (SEQ ID NO: 45)        SefB
VFVQFAI (SEQ ID NO: 46)        Caf1M
MKLNVSI (SEQ ID NO: 47)        CS3-1
MDIQMSI (SEQ ID NO: 48)        PsaB
LNILLSV (SEQ ID NO: 49)        MyfB
MNIQVSV (SEQ ID NO: 50)        AggD
DSDMISI (SEQ ID NO: 51)        CssC
LNVQLSV (SEQ ID NO: 52)        NfaA, AfaB



Deletions of residues from either terminus of the peptides and peptide analogs of 
formula (I) or (II) are also contemplated to be within the scope of the invention. Such 
deletions consist of the removal of one or more amino acids of the peptide sequence, with the 
lower limit length of the resulting peptide sequence being 3 to 7 amino acids, preferably 3 to 
5 amino acids. Such deletions may involve a single contiguous portion or more than one discrete 
portion of the peptide sequences. One or more such deletions may be introduced into the 
sequence, as long as such deletions result in peptides which may still bind in whole, or in 
part, to a pilus subunit and consequently prevent or inhibit pilus biogenesis. 
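
The terminal deletions described above can be enumerated mechanically. The following is a minimal sketch, not part of the disclosed method: the function name `terminal_deletions` is hypothetical, the 3-residue floor is taken from the lower limit stated above, and the parent peptide is SEQ ID NO: 1 from Table 3. It lists candidate sequences only and says nothing about whether a given fragment still binds a pilus subunit.

```python
def terminal_deletions(peptide, min_len=3):
    """Return every shorter peptide obtained by removing residues from the
    N-terminus, the C-terminus, or both, keeping at least min_len residues."""
    n = len(peptide)
    variants = []
    for n_cut in range(n):                # residues removed from the N-terminus
        for c_cut in range(n - n_cut):    # residues removed from the C-terminus
            if n_cut == 0 and c_cut == 0:
                continue                  # skip the undeleted parent peptide
            fragment = peptide[n_cut:n - c_cut]
            if len(fragment) >= min_len:
                variants.append(fragment)
    return variants


parent = "NVLQIAL"  # SEQ ID NO: 1 (PapD G1 beta-strand peptide)
for fragment in terminal_deletions(parent):
    print(fragment)
```

Each candidate produced this way would still have to be tested in the binding and pilus inhibition assays before any activity could be claimed for it.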

It will be appreciated that by virtue of the present invention, the above-described 
polypeptides can be synthesized using conventional synthesis procedures commonly used by 
one skilled in the art. For example, the polypeptides can be chemically synthesized using an 
automated peptide synthesizer (such as one manufactured by Pharmacia LKB Biotechnology 
Co., LKB Biolynk 4170 or Milligen, Model 9050 (Milligen, Millford, MA)) following the 
method of Sheppard, et al., Journal of Chemical Society Perkin I, p. 538 (1981). In this 
procedure, N,N'-dicyclohexylcarbodiimide is added to amino acids whose amine functional 
groups are protected by 9-fluorenylmethoxycarbonyl (Fmoc) groups and anhydrides of the 
desired amino acids are produced. These Fmoc-amino acid anhydrides can then be used for 
peptide synthesis. An Fmoc-amino acid anhydride corresponding to the C-terminal amino acid 
residue is fixed to Ultrosyn A resin through the carboxyl group using dimethylaminopyridine 
as a catalyst. Next, the resin is washed with dimethylformamide containing piperidine, and 
the protecting group of the amino functional group of the C-terminal acid is removed. The 
next amino acid corresponding to the desired peptide is coupled to the C-terminal amino acid. 
The deprotecting process is then repeated. Successive desired amino acids are fixed in the 
same manner until the peptide chain of the desired sequence is formed. The protective groups 
other than the acetoamidomethyl are then removed and the peptide is released with solvent. 

Alternatively, the polypeptides can be synthesized by using nucleic acid molecules 
which encode the peptides of this invention in an appropriate expression vector which includes 
the encoding nucleotide sequences. Such DNA molecules may be readily prepared using an 
automated DNA sequencer and the well-known codon-amino acid relationship of the genetic 
code. Such a DNA molecule also may be obtained as genomic DNA or as cDNA using 
oligonucleotide probes and conventional hybridization methodologies. Such DNA molecules 
may be incorporated into expression vectors, including plasmids, which are adapted for the 
expression of the DNA and production of the polypeptide in a suitable host such as a 
bacterium, e.g., Escherichia coli, a yeast cell or a mammalian cell. 
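
As an illustration of the codon-amino acid relationship mentioned above, the sketch below back-translates a peptide into one possible coding sequence using the standard genetic code and then translates it back as a check. The helper names (`back_translate`, `translate`) are hypothetical, and the choice of the first codon found for each residue is an arbitrary assumption made for brevity. It is not a codon-optimized design for any particular host.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code: codons enumerated in TCAG order map onto this string.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: use the first codon listed for each amino acid (no host codon bias).
FIRST_CODON = {}
for codon, aa in CODON_TABLE.items():
    FIRST_CODON.setdefault(aa, codon)


def back_translate(peptide):
    """Return one DNA sequence that encodes the given peptide."""
    return "".join(FIRST_CODON[aa] for aa in peptide)


def translate(dna):
    """Translate a DNA sequence in frame, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


peptide = "NVLQIAL"  # SEQ ID NO: 1 from Table 3
dna = back_translate(peptide)
print(dna, translate(dna) == peptide)
```

Because the genetic code is degenerate, many other DNA sequences encode the same peptide; a practical construct would also need a start codon, a stop codon and host-appropriate codon usage.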

It is known that certain modifications can be made without completely abolishing the 
polypeptide's antibacterial activity. Modifications include the removal and addition of amino 
acids. Polypeptides containing other modifications can be synthesized by one skilled in the 
art and compounds comprising such polypeptides may be tested for biological activity in the 
various assays and methods described in a later section. Thus, the effectiveness of the 
polypeptides can be modulated through various changes in the amino acid sequence or 
structure. 

Further, it should be understood that the mimic may be modified using methods 
known in the art to improve binding, specificity, solubility, safety, or efficacy. A necessary 
characteristic of these preferred compounds is the capability to interact with at least one pilus 
subunit during transport of these pilus subunits through periplasmic space and/or during the 
process of assembly of the intact pilus, in such a manner that pilus biogenesis is prevented or 
inhibited. The compound can be any compound, preferably a peptide, which has one of the 
above effects on pilus subunits and thereby on the assembly of an intact pilus. 

Moreover, the present invention is directed to a compound which will mimic the 
capability of mannose to bind to the mannose binding site at the tip of the FimH adhesin, 
thereby preventing or inhibiting the ability of the pilus to adhere to and infect host tissues. As 
discussed above, the bottom of this mannose-binding domain of FimH is lined with 
asparagine, glutamine and aspartic acid residues, and those skilled in the art would be able to 
use molecular modeling techniques and other existing protocols to design and synthesize 
antibacterial compounds. Such compounds would compete with mannose for binding to the 
FimH adhesin, thereby preventing or inhibiting pilus adhesion to host epithelium. As such, 
these compounds may be used in methods of preventing or inhibiting pili adhesion to a host 
tissue. 

The present invention also provides a method for inhibiting bacterial colonization by a 
Gram-negative organism. This method involves administration of a compound which will 
interfere with the binding of a chaperone to a pilus subunit, thereby preventing the assembly 
of an intact pilus structure. In a preferred embodiment of the invention, a method of 
preventing or inhibiting the assembly of pilus subunits is provided by interfering with, in the 
PapK pilus subunit, a binding site which is normally involved in the binding to pilus subunits 
during transport of these pilus subunits through the periplasmic space and/or during the 
process of pilus assembly. In another embodiment of the invention, a method of preventing 
or inhibiting the assembly of pilus subunits is provided by interfering with, in the FimC pilus 
subunit, a binding site which is normally involved in the binding to pilus subunits during 
transport of these pilus subunits through the periplasmic space and/or during the process of 
pilus assembly. 

Antibacterial compounds and pharmaceutical compositions 

In another preferred embodiment of the invention, a method of preventing or 
inhibiting the assembly of pilus subunits is provided by administering an antibacterial 
compound which will mimic the capability of a periplasmic chaperone or a pilus subunit to 
bind to a pilus subunit. Also provided is a method of preventing or inhibiting the adhesion of 
a pilus to a host tissue by administering an antibacterial compound which will bind to a pilus 
mannose-binding domain. 

The antibacterial compositions of the present invention may be utilized to inhibit pili 
assembly and/or pili adhesion by providing an effective amount of such compositions to a 
patient. 

For use as antimicrobials for treatment of animal subjects, the compounds of the 
invention can be formulated as pharmaceutical or veterinary compositions. Depending on the 
subject to be treated, the mode of administration, and the type of treatment desired, e.g., 
prevention, prophylaxis, therapy, the compounds are formulated in ways consonant with 
these parameters. A summary of such techniques is found in Remington's Pharmaceutical 
Sciences, latest edition, Mack Publishing Co., Easton, PA. 

For administration to animal or human subjects, the dosage of the compounds of the 
invention is typically 0.1-100 mg/kg. However, dosage levels are highly dependent on the 
nature of the infection, the condition of the patient, the judgment of the practitioner, and the 
frequency and mode of administration. The dosage of such a substance is expected to be the 
dosage which is normally employed when administering antibacterial drugs to patients or 
animals, i.e. 1 µg - 1000 µg per kilogram of body weight per day. The dosage will depend 
partly on the route of administration of the substance. If the oral route is employed, the 
absorption of the substance will be an important factor. A low absorption will have the effect 
that in the gastro-intestinal tract higher concentrations, and thus higher dosages, will be 
necessary. Also, the dosage of such a substance when treating infections of the central 
nervous system (CNS) will be dependent on the permeability of the blood-brain barrier for 
the substance. As is well known in the treatment of bacterial meningitis with penicillin, very 
high dosages are necessary in order to obtain effective concentrations in the CNS. 

It will be understood that the appropriate dosage of the substance should suitably be 
assessed by performing animal model tests, wherein the effective dose level (e.g. ED50) and 
the toxic dose level (e.g. TD50) as well as the lethal dose level (e.g. LD50 or LD10) are 
established in suitable and acceptable animal models. Further, if a substance has proven 
efficient in such animal tests, controlled clinical trials should be performed. Needless to state, 
such clinical trials should be performed according to the standards of Good Clinical Practice. 

In general, for use in treatment, the compounds of the invention may be used alone or 
in combination with other antibiotics such as erythromycin, tetracycline, macrolides, for 
example azithromycin, and the cephalosporins. Depending on the mode of administration, the 
compounds will be formulated into suitable compositions to permit facile delivery to the 
affected areas. 

Formulations may be prepared in a manner suitable for systemic administration or 
topical or local administration. Systemic formulations include those designed for injection 
(e.g., intramuscular, intravenous or subcutaneous injection) or may be prepared for 
transdermal, transmucosal, or oral administration. The formulation will generally include a 
diluent as well as, in some cases, adjuvants, buffers, preservatives and the like. 

For oral administration, the compounds can be administered also in liposomal 
compositions or as microemulsions. Suitable forms include syrups, capsules, tablets, as is 
understood in the art. For injection, formulations can be prepared in conventional forms as 
liquid solutions or suspensions or as solid forms suitable for solution or suspension in liquid 
prior to injection or as emulsions. Suitable excipients include, for example, water, saline, 
dextrose, glycerol and the like. Such compositions may also contain amounts of nontoxic 
auxiliary substances such as wetting or emulsifying agents, pH buffering agents and the like, 
such as, for example, sodium acetate, sorbitan monolaurate, and so forth. 

It will be understood that the above-described methods comprising administration of 
substances in treating and/or preventing diseases are dependent on the identification or de 
novo design of substances which are capable of exerting effects which will lead to prevention 
or inhibition of the interaction between pilus subunits and periplasmic molecular chaperones. 
It is further important that these substances will have a high chance of being therapeutically 
active. 

Thus clinical experimental trials and animal studies can be undertaken to demonstrate 
the therapeutic efficacy of peptide mimics and analogues for preventing or inhibiting pilus 
assembly. The efficacy of such compounds can be shown using methods known in the art, 
including pilus inhibition and binding assays, specifically ELISA or hemagglutination. 

The antibacterial compositions of the present invention also have a variety of 
industrial uses, well known to those skilled in such arts, relating to their antibacterial 
properties. In general, these uses are carried out by bringing a biocidal or bacterial inhibitory 
amount of the antibacterial compositions of the present invention into contact with a surface, 
environment or biozone containing Gram-negative bacteria so that the composition is able to 
interact with and thereby interfere with the biological function of such bacteria. For example, 
such antibacterial compositions can be used to prevent or inhibit biofilm formation caused by 
Gram-negative bacteria and to inhibit bacterial colonization by a Gram-negative organism. 
Compositions may be formulated as sprays, solutions, pellets, powders and in other forms of 
administration well known to those skilled in such arts. 

Crystalline PapD-PapK Chaperone-Subunit Co-complex and 
FimH-FimC Chaperone-Adhesin Co-complex 

The present invention provides, for the first time, the high-resolution 
three-dimensional structure and atomic structure coordinates of the crystalline co-complexes of the 
PapD-PapK chaperone-subunit as determined by X-ray crystallography. Also provided for 
use in the methods of the present invention are the high-resolution three-dimensional 
structures and atomic structure coordinates for the crystalline co-complexes of the 
FimC-FimH chaperone-adhesin as determined by X-ray crystallography. The specific methods used 
to obtain the structure coordinates are provided in the examples, infra. The atomic structure 
coordinates of crystalline PapD-PapK co-complex, obtained from the co-crystal to 2.4 Å 
resolution, are listed in Table 4. The atomic structure coordinates of crystalline FimC-FimH 
co-complex, obtained from the co-crystal to 2.5 Å resolution, are listed in Table 5. 

Additional antibacterial compounds can be modeled and synthesized utilizing the 
atomic coordinates obtained from the resolution of the co-crystal structure of the PapD-PapK 
chaperone-subunit co-complex and the FimC-FimH chaperone-adhesin co-complex. For 
example, as discussed herein, applicants utilized the co-crystal structure of the FimC-FimH 
chaperone-adhesin co-complex to identify the NH2-terminal mannose-binding domain of 
FimH, an essential component required for pilus adhesion to host tissues. As the 
COOH-terminus of pilus subunits in many tissue-adhering bacteria has been found to be highly 
conserved, it is believed that the antibacterial compounds of the present invention are capable 
of interacting with the majority of pilus subunits and thus are useful in the treatment of 
various diseases caused by piliated bacteria. 

Thus, the invention encompasses a co-crystal of a pilus chaperone-subunit 
co-complex comprising an amino acid sequence of a G1 beta-strand of a periplasmic chaperone 
and an amino acid sequence from the amino-terminal sequence of a pilus subunit. Preferably, 
the amino acid sequence of a G1 beta-strand would be the N101 to L107 amino acid region of 
a G1 beta-strand of a pilus chaperone, and even more preferably, the amino acid sequence of a 
G1 beta-strand would be the N101 to L107 amino acid region of a G1 beta-strand of a PapD 
chaperone, and most preferably, the amino acid sequence of the G1 beta-strand would be SEQ 
ID NO: 1. Preferably, the amino acid sequence of the amino-terminal sequence would be 
from the N-terminal sequence of a PapK subunit, and more preferably, the amino acid 
sequence of the amino-terminal sequence would be the amino acid sequence of SEQ ID NO: 
12. In a preferred embodiment, the co-crystal is a crystalline form of the polypeptides 
corresponding to the PapD-PapK chaperone-subunit co-complex. In a preferred embodiment 
of the invention, the co-crystal effectively diffracts X-rays for the determination of the atomic 
coordinates of the pilus chaperone-subunit co-complex to a resolution of from about 3 
angstroms to about 2.4 angstroms or greater. 

Preferably, co-crystals of the invention comprise crystallized polypeptides 
corresponding to the wild-type PapD-PapK chaperone-subunit co-complex. The co-crystals 
of the invention include native co-crystals in which the crystallized PapD-PapK 
chaperone-subunit co-complex is substantially pure and heavy-atom derivative co-crystals in which 
the crystallized PapD-PapK chaperone-subunit co-complex is in association with one or more 
heavy-metal atoms. The co-crystals from which the atomic structure coordinates of the 
crystalline co-complexes of the present invention may be obtained include native co-crystals 
and heavy-atom derivative co-crystals. Native co-crystals generally comprise substantially 
pure polypeptides corresponding to the PapD-PapK co-complex in crystalline form. 

It is to be understood that the crystalline PapD-PapK co-complex from which the 
atomic structure coordinates of the invention can be obtained is not limited to the wild-type 
PapD-PapK co-complex. Indeed, the co-crystals may comprise mutants of the wild-type 
co-complex. Mutants of wild-type co-complexes are obtained by replacing at least one amino 
acid residue in the sequences of one or both of the polypeptides comprising the wild-type 
co-complex with a different amino acid residue, or by adding or deleting one or more amino acid 
residues within the wild-type sequences and/or at the N- and/or C-terminus of one or both of 
the polypeptides comprising the wild-type co-complex. Preferably, such mutants will 
crystallize under crystallization conditions that are substantially similar to those used to 
crystallize the wild-type co-complex. 

The types of mutants contemplated by this invention include conservative mutants, 
non-conservative mutants, deletion mutants, truncated mutants, extended mutants, methionine 
mutants, selenomethionine mutants, cysteine mutants and selenocysteine mutants. A mutant 
may have, but need not have, pilus subunit binding activity. Preferably, a mutant displays 
biological activity that is substantially similar to that of the wild-type polypeptide. 
Methionine, selenomethionine, cysteine, and selenocysteine mutants are particularly useful for 
producing heavy-atom derivative co-crystals, as described in detail below. 

It will be recognized by one of skill in the art that the types of mutants contemplated 
herein are not mutually exclusive; that is, for example, a polypeptide having a conservative 
mutation in one amino acid may in addition have a truncation of residues at the N-terminus, 
and several Leu or Ile -> Met mutations. 

Sequence alignments of polypeptides in a protein family or of homologous 
polypeptide domains can be used to identify potential amino acid residues in the polypeptide 
sequence that are candidates for mutation. Identifying mutations that do not significantly 
interfere with the three-dimensional structure of the PapD-PapK co-complex and the FimC- 
FimH co-complex and/or that do not deleteriously affect, and that may even enhance, the 
activity of the co-complex will depend, in part, on the region where the mutation occurs. 

Conservative amino acid substitutions are well-known in the art, and include 
substitutions made on the basis of a similarity in polarity, charge, solubility, hydrophobicity 
and/or the hydrophilicity of the amino acid residues involved. Typical conservative 
substitutions are those in which the amino acid is substituted with a different amino acid that 
is a member of the same class or category, as those classes are defined herein. Thus, typical 
conservative substitutions include aromatic to aromatic, apolar to apolar, aliphatic to 
aliphatic, acidic to acidic, basic to basic, polar to polar, etc. Other conservative amino acid 
substitutions are well known in the art. It will be recognized by those of skill in the art that 
generally, a total of about 20% or fewer, typically about 10% or fewer, most usually about 
5% or fewer, of the amino acids in the wild-type polypeptide sequence can be conservatively 
substituted with other amino acids without deleteriously affecting the biological activity 
and/or three-dimensional structure of the molecule, provided that such substitutions do not 
involve residues that are critical for activity, as discussed above. 

The heavy-atom derivative co-crystals from which the atomic structure coordinates of 
the invention are obtained generally comprise a crystalline co-complex in association with 
one or more heavy metal atoms. The polypeptides may correspond to a wild-type or a mutant 
PapD-PapK co-complex or FimC-FimH co-complex, which may optionally be further 
associated with one or more molecules. There are two types of heavy-atom derivatives of 
polypeptides: heavy-atom derivatives resulting from exposure of the proteins to a heavy metal 
in solution, wherein co-crystals are grown in medium comprising the heavy metal, or in 
crystalline form, wherein the heavy metal diffuses into the co-crystal, and heavy-atom 
derivatives wherein at least one of the polypeptides in the co-complex comprises heavy-atom 
containing amino acids, e.g., selenomethionine and/or selenocysteine mutants. 

In practice, heavy-atom derivatives of the first type can be formed by soaking a native 
co-crystal in a solution comprising heavy metal atom salts, or organometallic compounds, 
e.g., lead chloride, gold thiomalate, thimerosal, uranyl acetate, platinum tetrachloride, 
osmium tetroxide, zinc sulfate, and cobalt hexamine, which can diffuse through the 
co-crystal and bind to the crystalline polypeptides. 

Heavy-atom derivatives of this type can also be formed by adding to a crystallization 
solution comprising the polypeptides to be co-crystallized an amount of a heavy metal atom 
salt, which may associate with at least one of the proteins and be incorporated into the 
co-crystal. The location(s) of the bound heavy metal atom(s) can be determined by X-ray 
diffraction analysis of the co-crystal. This information, in turn, is used to generate the phase 
information needed to construct the three-dimensional structure of the proteins in the 
co-complex. 

The native and/or heavy-atom derivative co-crystals from which the atomic structure 
coordinates of the invention are obtained can be obtained by conventional means as are 
well-known in the art of protein crystallography, including batch, liquid bridge, dialysis, and vapor 
diffusion methods (see, e.g., McPherson, 1982, Preparation and Analysis of Protein Crystals, 
John Wiley, New York; McPherson, 1990, Eur. J. Biochem. 189:1-23; Weber, 1991, Adv. 
Protein Chem. 41:1-36). Generally, native co-crystals are grown by dissolving substantially 
pure polypeptide corresponding to the PapD-PapK co-complex or the FimH-FimC co-complex in 
an aqueous buffer containing a precipitant at a concentration just below that necessary to 
precipitate the protein. Examples of precipitants include, but are not limited to, polyethylene 
glycol, ammonium sulfate, 2-methyl-2,4-pentanediol, sodium citrate, sodium chloride, 
glycerol, isopropanol, lithium sulfate, sodium acetate, sodium formate, potassium sodium 
tartrate, ethanol, hexanediol, ethylene glycol, dioxane, t-butanol and combinations thereof. 
Water is removed by controlled evaporation to produce precipitating conditions, which are 
maintained until co-crystal growth ceases. 

In a preferred embodiment, native co-crystals are grown by vapor diffusion in hanging 
drops (McPherson, 1982, Preparation and Analysis of Protein Crystals, John Wiley, New 
York; McPherson, 1990, Eur. J. Biochem. 189:1-23). In this method, the 
polypeptide/precipitant solution is allowed to equilibrate in a closed container with a larger 
aqueous reservoir having a precipitant concentration optimal for producing crystals. 
Generally, less than about 25 µL of substantially pure polypeptide solution is mixed with an 
equal volume of reservoir solution, giving a precipitant concentration about half that required 
for crystallization. This solution is suspended as a droplet underneath a coverslip, which is 
sealed onto the top of the reservoir. The sealed container is allowed to stand, usually for 
about 2-6 weeks, until co-crystals grow. 
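
The "about half" figure above follows from simple dilution arithmetic, sketched below. The function name and the numerical values (a 20% precipitant reservoir and a 2 µL + 2 µL drop) are hypothetical, and the sketch assumes the polypeptide solution itself contains no precipitant before mixing.

```python
def drop_precipitant_concentration(reservoir_conc, protein_volume_ul, reservoir_volume_ul):
    """Precipitant concentration in the drop immediately after mixing.

    Assumes the protein solution contributes volume but no precipitant.
    """
    total_volume = protein_volume_ul + reservoir_volume_ul
    return reservoir_conc * reservoir_volume_ul / total_volume


# Hypothetical example: 20% (w/v) precipitant in the reservoir, equal-volume drop.
start = drop_precipitant_concentration(20.0, 2.0, 2.0)
print(start)  # 10.0, i.e. half the reservoir concentration
```

During equilibration in the sealed container, water leaves the drop for the reservoir, so the drop concentration rises from this starting value toward the reservoir value.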

Heavy-atom derivative co-crystals can be obtained by soaking native co-crystals in 
mother liquor containing salts of heavy metal atoms. Further, heavy-atom derivative co- 
crystals can also be obtained from SeMet and/or SeCys mutants, as described above for 
native co-crystals. 

Mutant proteins may crystallize under slightly different crystallization conditions than 
wild-type protein, or under very different crystallization conditions, depending on the nature 
of the mutation, and its location in the protein. For example, a non-conservative mutation 
may result in alteration of the hydrophilicity of the mutant, which may in turn make the 
mutant protein either more soluble or less soluble than the wild-type protein. Typically, if a 
protein becomes more hydrophilic as a result of a mutation, it will be more soluble than the 
wild-type protein in an aqueous solution and a higher precipitant concentration will be needed 
to cause it to crystallize. Conversely, if a protein becomes less hydrophilic as a result of a 
mutation, it will be less soluble in an aqueous solution and a lower precipitant concentration 
will be needed to cause it to crystallize. If the mutation happens to be in a region of the 
protein involved in crystal lattice contacts, crystallization conditions may be affected in more 
unpredictable ways. 

The dimensions of a unit cell of a crystal are defined by six numbers: the lengths of 
three unique edges, a, b, and c, and three unique angles, α, β, and γ. The type of unit cell that 
comprises a crystal is dependent on the values of these variables. In one embodiment, the 
co-crystal of the PapD-PapK pilus chaperone-subunit co-complex has the space group P2₁2₁2₁ 
with unit cell dimensions of a = 62.1 ± 0.2 angstroms, b = 63.6 ± 0.2 angstroms and c = 92.7 
± 0.2 angstroms such that the three-dimensional structure of the crystallized co-complex can 
be determined to a resolution of from about 3 angstroms to about 2.4 angstroms or greater. In 
another embodiment, the co-crystal of the FimC-FimH chaperone-adhesin co-complex has 
the space group P4₁2₁2 or P4₃ with unit cell dimensions of a = b = 97.7 ± 0.2 angstroms and c = 
215.9 ± 0.2 angstroms such that the three-dimensional structure of the co-complex can be 
determined to a resolution of from about 3 angstroms to about 2.5 angstroms or greater. 
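
To make the six unit cell parameters concrete, the sketch below computes a unit cell volume from a, b, c and the three angles using the general (triclinic) textbook formula. The function name is hypothetical. The angles are assumed to be 90 degrees for both cells, which follows from their orthorhombic and tetragonal space groups rather than from values stated explicitly in the text.

```python
import math


def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in cubic angstroms; edges in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


# Cell edges quoted above; all angles assumed to be 90 degrees.
papd_papk = unit_cell_volume(62.1, 63.6, 92.7)
fimc_fimh = unit_cell_volume(97.7, 97.7, 215.9)
print(round(papd_papk), round(fimc_fimh))
```

With all angles at 90 degrees the square-root term equals 1 and the volume reduces to the product a·b·c.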

When a crystal is placed in an X-ray beam, the incident X-rays interact with the 
electron cloud of the molecules that make up the crystal, resulting in X-ray scatter. The 
combination of X-ray scatter with the lattice of the crystal gives rise to nonuniformity of the 
scatter; areas of high intensity are called diffracted X-rays. The angle at which diffracted 
beams emerge from the crystal can be computed by treating diffraction as if it were reflection 
from sets of equivalent, parallel planes of atoms in a crystal (Bragg's Law). The most 
obvious sets of planes in a crystal lattice are those that are parallel to the faces of the unit cell. 
These and other sets of planes can be drawn through the lattice points. Each set of planes is 
identified by three indices, hkl. The h index gives the number of parts into which the a edge 
of the unit cell is cut, the k index gives the number of parts into which the b edge of the unit 
cell is cut, and the l index gives the number of parts into which the c edge of the unit cell is 
cut by the set of hkl planes. Thus, for example, the 235 planes cut the a edge of each unit cell 
into halves, the b edge of each unit cell into thirds, and the c edge of each unit cell into fifths. 
Planes that are parallel to the bc face of the unit cell are the 100 planes; planes that are 
parallel to the ac face of the unit cell are the 010 planes; and planes that are parallel to the ab 
face of the unit cell are the 001 planes. 
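
The relationship between hkl indices, plane spacing and diffraction angle can be sketched as follows. This is an illustration under stated assumptions, not a procedure from the examples: the spacing formula holds only for cells with all angles equal to 90 degrees, the function names are hypothetical, and the wavelength of 1.5418 angstroms (copper K-alpha) is an assumed value that the text does not specify.

```python
import math


def d_spacing_orthogonal(h, k, l, a, b, c):
    """Interplanar spacing (angstroms) of the hkl planes for a cell with 90-degree angles."""
    inverse_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    return 1.0 / math.sqrt(inverse_d_squared)


def bragg_angle_deg(d, wavelength):
    """Bragg angle theta (degrees) from Bragg's law: wavelength = 2 * d * sin(theta)."""
    return math.degrees(math.asin(wavelength / (2.0 * d)))


A, B, C = 62.1, 63.6, 92.7   # cell edges quoted for the PapD-PapK co-crystal
WAVELENGTH = 1.5418          # assumed X-ray wavelength in angstroms

for hkl in [(1, 0, 0), (0, 0, 1), (2, 3, 5)]:
    d = d_spacing_orthogonal(*hkl, A, B, C)
    print(hkl, round(d, 2), round(bragg_angle_deg(d, WAVELENGTH), 2))
```

The 100 planes come out with a spacing equal to the a edge, and higher indices give smaller spacings and therefore larger diffraction angles.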

When a detector is placed in the path of the diffracted X-rays, in effect cutting into the 
sphere of diffraction, a series of spots, or reflections, are recorded to produce a "still" 
diffraction pattern. Each reflection is the result of X-rays reflecting off one set of parallel 
planes, and is characterized by an intensity, which is related to the distribution of molecules 
in the unit cell, and hkl indices, which correspond to the parallel planes from which the beam 
producing that spot was reflected. If the crystal is rotated about an axis perpendicular to the 
X-ray beam, a large number of reflections is recorded on the detector, resulting in a 
diffraction pattern. 

The unit cell dimensions and space group of a crystal can be determined from its 
diffraction pattern. First, the spacing of reflections is inversely proportional to the lengths of 
the edges of the unit cell. Therefore, if a diffraction pattern is recorded when the X-ray beam 
is perpendicular to a face of the unit cell, two of the unit cell dimensions may be deduced 
from the spacing of the reflections in the x and y directions of the detector, the 
crystal-to-detector distance, and the wavelength of the X-rays. Those of skill in the art will appreciate 
that, in order to obtain all three unit cell dimensions, the crystal must be rotated such that the 
X-ray beam is perpendicular to another face of the unit cell. Second, the angles of a unit cell 
can be determined by the angles between lines of spots on the diffraction pattern. Third, the 
absence of certain reflections and the repetitive nature of the diffraction pattern, which may 
be evident by visual inspection, indicate the internal symmetry, or space group, of the crystal. 
Therefore, a crystal may be characterized by its unit cell and space group, as well as by its 
diffraction pattern. 
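
The inverse relationship between spot spacing and cell edge can be illustrated with a simplified geometry: a flat detector perpendicular to the beam, and a reflection of a given order along one cell axis that satisfies Bragg's law. The function names, the 150 mm crystal-to-detector distance and the 1.5418 angstrom wavelength are all assumptions chosen for the sketch.

```python
import math


def spot_position_mm(order, cell_edge, wavelength, distance_mm):
    """Distance from the beam centre at which the given order lands on a flat detector."""
    theta = math.asin(order * wavelength / (2.0 * cell_edge))
    return distance_mm * math.tan(2.0 * theta)


def cell_edge_from_spot(order, position_mm, wavelength, distance_mm):
    """Invert the geometry: recover the cell edge (angstroms) from a measured spot position."""
    theta = 0.5 * math.atan(position_mm / distance_mm)
    return order * wavelength / (2.0 * math.sin(theta))


WAVELENGTH = 1.5418   # assumed wavelength, angstroms
DISTANCE = 150.0      # assumed crystal-to-detector distance, mm

for edge in (62.1, 92.7, 215.9):
    x = spot_position_mm(1, edge, WAVELENGTH, DISTANCE)
    print(edge, round(x, 2), round(cell_edge_from_spot(1, x, WAVELENGTH, DISTANCE), 1))
```

Longer cell edges give more closely spaced spots, which is why large unit cells call for a longer crystal-to-detector distance or a finer detector to resolve neighbouring reflections.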

Once the dimensions of the unit cell are determined, the likely number of polypeptides 
in the asymmetric unit can be deduced from the size of the polypeptide, the density of the 
average protein, and the typical solvent content of a protein crystal, which is usually in the 
range of 30-70% of the unit cell volume. 
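
One common way to carry out this deduction is the Matthews coefficient, sketched below; the text does not name this method, so it is offered only as an illustration. The molecular mass of 42,000 Da for one chaperone-subunit co-complex is a placeholder that is not taken from the source, the count of 4 asymmetric units per cell is the standard value for space group P2₁2₁2₁, and the factor 1.23 assumes a typical protein partial specific volume of about 0.74 cm³/g.

```python
def matthews_coefficient(cell_volume, asym_units_per_cell, copies, mass_da):
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return cell_volume / (asym_units_per_cell * copies * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1.0 - 1.23 / vm


cell_volume = 62.1 * 63.6 * 92.7   # orthorhombic cell quoted for PapD-PapK, cubic angstroms
ASYM_UNITS = 4                     # asymmetric units per cell in P2(1)2(1)2(1)
ASSUMED_MASS_DA = 42000.0          # hypothetical mass of one co-complex, not from the source

plausible = []
for copies in (1, 2, 3):
    vm = matthews_coefficient(cell_volume, ASYM_UNITS, copies, ASSUMED_MASS_DA)
    fraction = solvent_fraction(vm)
    if 0.30 <= fraction <= 0.70:   # the typical range stated above
        plausible.append(copies)
    print(copies, round(vm, 2), round(fraction, 2))

print("plausible copies per asymmetric unit:", plausible)
```

Under these assumptions only one co-complex per asymmetric unit gives a solvent content inside the 30-70% range; a different assumed mass would shift the numbers.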

The diffraction pattern is related to the three-dimensional shape of the molecule by a 
Fourier transform. The process of determining the solution is in essence a re-focusing of the 
diffracted X-rays to produce a three-dimensional image of the molecule in the crystal. Since 
re-focusing of X-rays cannot be done with a lens at this time, it is done via mathematical 
operations. 

The sphere of diffraction has symmetry that depends on the internal symmetry of the 
crystal, which means that certain orientations of the crystal will produce the same set of 
reflections. Thus, a crystal with high symmetry has a more repetitive diffraction pattern, and 
there are fewer unique reflections that need to be recorded in order to have a complete 
representation of the diffraction. The goal of data collection, a dataset, is a set of consistently 
measured, indexed intensities for as many reflections as possible. A complete dataset is 
collected if at least 80%, preferably at least 90%, most preferably at least 95% of unique 
reflections are recorded. In one embodiment, a complete dataset is collected using one 
crystal. In another embodiment, a complete dataset is collected using more than one crystal 
of the same type. 

Sources of X-rays include, but are not limited to, a rotating anode X-ray generator 
such as a Rigaku RU-200 or a beamline at a synchrotron light source, such as the Advanced 
Photon Source at Argonne National Laboratory. Suitable detectors for recording diffraction 
patterns include, but are not limited to, X-ray sensitive film, multiwire area detectors, image 
plates coated with phosphorus, and CCD cameras. Typically, the detector and the X-ray 
beam remain stationary, so that, in order to record diffraction from different parts of the 
crystal's sphere of diffraction, the crystal itself is moved via an automated system of 
moveable circles called a goniostat. 

One of the biggest problems in data collection, particularly from macromolecular
crystals having a high solvent content, is the rapid degradation of the crystal in the X-ray
beam. In order to slow the degradation, data is often collected from a crystal at liquid
nitrogen temperatures. In order for a crystal to survive the initial exposure to liquid nitrogen, 
the formation of ice within the crystal must be prevented by the use of a cryoprotectant. 
Suitable cryoprotectants include, but are not limited to, low molecular weight polyethylene 
glycols, ethylene glycol, sucrose, glycerol, xylitol, and combinations thereof. Crystals may 
be soaked in a solution comprising the one or more cryoprotectants prior to exposure to liquid 
nitrogen, or the one or more cryoprotectants may be added to the crystallization solution. 
Data collection at liquid nitrogen temperatures may allow the collection of an entire dataset
from one crystal. 

Once a dataset is collected, the information is used to determine the three-dimensional 
structure of the molecule in the crystal. However, this cannot be done from a single
measurement of reflection intensities because certain information, known as phase
information, is lost between the three-dimensional shape of the molecule and its Fourier
transform, the diffraction pattern. This phase information must be acquired by methods
described below in order to perform a Fourier transform on the diffraction pattern to obtain
the three-dimensional structure of the molecule in the crystal. It is the determination of phase
information that in effect refocuses X-rays to produce the image of the molecule. 
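
By way of non-limiting illustration, the loss of phase information can be sketched with a one-dimensional toy calculation. The density values, the eight-point grid and the helper names dft and inverse_dft are assumptions made for this example and are not taken from the co-crystal data; the sketch only shows that amplitudes alone do not return the original density.

```python
import cmath
import math


def dft(density):
    """Discrete Fourier transform: density samples -> complex structure factors."""
    n = len(density)
    return [
        sum(rho * cmath.exp(-2j * math.pi * h * x / n) for x, rho in enumerate(density))
        for h in range(n)
    ]


def inverse_dft(factors):
    """Inverse transform: complex structure factors -> real density samples."""
    n = len(factors)
    return [
        sum(f * cmath.exp(2j * math.pi * h * x / n) for h, f in enumerate(factors)).real / n
        for x in range(n)
    ]


# Hypothetical one-dimensional "unit cell" holding two atoms of different weight.
true_density = [0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 2.0, 0.0]
factors = dft(true_density)

# A diffraction measurement yields intensities, i.e. amplitudes without phases.
amplitudes = [abs(f) for f in factors]

# Amplitudes combined with the correct phases reproduce the density.
with_phases = inverse_dft(factors)

# Amplitudes with every phase set to zero give a different, misplaced image.
without_phases = inverse_dft([complex(a, 0.0) for a in amplitudes])
```

In this toy case the zero-phase synthesis places its peaks at the wrong positions, which is why the phasing methods described below are needed before a map can be interpreted.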

One method of obtaining phase information is by isomorphous replacement, in which 
heavy-atom derivative crystals are used. In this method, the positions of heavy atoms bound 
to the molecules in the heavy-atom derivative crystal are determined, and this information is 
then used to obtain the phase information necessary to elucidate the three-dimensional
structure of a native crystal (Blundell et al., 1976, Protein Crystallography, Academic Press).

Another method of obtaining phase information is by molecular replacement, which is 
a method of calculating initial phases for a new crystal of a polypeptide or polypeptide co- 
complex whose structure coordinates are unknown by orienting and positioning a polypeptide
whose structure coordinates are known within the unit cell of the new crystal so as to best
account for the observed diffraction pattern of the new crystal. Phases are then calculated
from the oriented and positioned polypeptide and combined with observed amplitudes to
provide an approximate Fourier synthesis of the structure of the molecules comprising the
new crystal (Lattman, 1985, Methods in Enzymology 115:55-77; Rossmann, 1972, "The
Molecular Replacement Method," Int. Sci. Rev. Ser. No. 13, Gordon & Breach, New York).

A third method of phase determination is multi-wavelength anomalous dispersion or 
MAD. In this method, X-ray diffraction data are collected at several different wavelengths
from a single crystal containing at least one heavy atom with absorption edges near the
energy of incoming X-ray radiation. The resonance between X-rays and electron orbitals
leads to differences in X-ray scattering that permit the locations of the heavy atoms to be
identified, which in turn provides phase information for a crystal of a polypeptide. A detailed
discussion of MAD analysis can be found in Hendrickson, 1985, Trans. Am. Crystallogr.
Assoc. 21:11; Hendrickson et al., 1990, EMBO J. 9:1665; and Hendrickson, 1991, Science
4:91- 

A fourth method of determining phase information is single wavelength anomalous
dispersion or SAD. In this technique, X-ray diffraction data are collected at a single
wavelength from a single native or heavy-atom derivative crystal, and phase information is
extracted using anomalous scattering information from atoms such as sulfur or chlorine in the
native crystal or from the heavy atoms in the heavy-atom derivative crystal. The wavelength
of X-rays used to collect data for this phasing technique need not be close to the absorption
edge of the anomalous scatterer. A detailed discussion of SAD analysis can be found in
Brodersen et al., 2000, Acta Cryst. D56:431-441.

A fifth method of determining phase information is single isomorphous replacement 
with anomalous scattering or SIRAS. This technique combines isomorphous replacement 
and anomalous scattering techniques to provide phase information for a crystal of a 
polypeptide. X-ray diffraction data are collected at a single wavelength, usually from a single
heavy-atom derivative crystal. Phase information obtained only from the location of the
heavy atoms in a single heavy-atom derivative crystal leads to an ambiguity in the phase
angle, which is resolved using anomalous scattering from the heavy atoms. Phase
information is therefore extracted from both the location of the heavy atoms and from
anomalous scattering of the heavy atoms. A detailed discussion of SIRAS analysis can be
found in North, 1965, Acta Cryst. 18:212-216; Matthews, 1966, Acta Cryst. 20:82-86.

Once phase information is obtained, it is combined with the diffraction data to
produce an electron density map, an image of the electron clouds that surround the molecules
in the unit cell. The higher the resolution of the data, the more distinguishable are the
features of the electron density map, e.g., amino acid side chains and the positions of
carbonyl oxygen atoms in the peptide backbones, because atoms that are closer together are
resolvable. A model of the macromolecule is then built into the electron density map with the
aid of a computer, using as a guide all available information, such as the polypeptide
sequence and the established rules of molecular structure and stereochemistry. Interpreting
the electron density map is a process of finding the chemically realistic conformation that fits
the map precisely. 

After a model is generated, a structure is refined. Refinement is the process of
minimizing the function Φ, which is the difference between observed and calculated intensity
values (measured by an R-factor), and which is a function of the position, temperature factor,
and occupancy of each non-hydrogen atom in the model. This usually involves alternate
cycles of real space refinement, i.e., calculation of electron density maps and model building,
and reciprocal space refinement, i.e., computational attempts to improve the agreement
between the original intensity data and intensity data generated from each successive model.
Refinement ends when the function Φ converges on a minimum wherein the model fits the
electron density map and is stereochemically and conformationally reasonable. During
refinement, ordered solvent molecules are added to the structure. 
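
As a non-limiting sketch of how agreement between observed and calculated data may be scored, the following assumes the conventional crystallographic R-factor computed on structure-factor amplitudes, R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|). The function name r_factor and the four amplitude values are hypothetical and serve only to illustrate the arithmetic.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor on amplitudes: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for four reflections (arbitrary units).
observed = [100.0, 50.0, 25.0, 25.0]
calculated = [90.0, 55.0, 30.0, 20.0]

# (10 + 5 + 5 + 5) / 200 = 0.125
r = r_factor(observed, calculated)
```

A refinement cycle that lowers this value while keeping the model stereochemically reasonable is moving toward the minimum described above.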

The atomic structure coordinates and machine readable media of the invention have a
variety of uses. The present invention encompasses the structure coordinates and other
information, e.g., amino acid sequence, connectivity tables, vector-based representations,
temperature factors, etc., used to generate the three-dimensional structures of the polypeptides
for use in the software programs described below and other software programs. For example,
the coordinates are useful for solving the three-dimensional X-ray diffraction and/or solution
structures of other proteins, including mutant PapD-PapK chaperone-subunit or FimC-FimH
chaperone-adhesin co-complexes, PapD-PapK chaperone-subunit co-complexes or FimC- 
FimH chaperone-adhesin co-complexes that are further associated with other molecules, and 
unrelated proteins, to high resolution. Structural information may also be used in a variety of
molecular modeling and computer-based screening applications to, for example, intelligently
design mutants of the crystallized PapD-PapK chaperone-subunit co-complex or the
crystallized FimC-FimH chaperone-adhesin co-complex that have altered biological activity
and to computationally design and identify compounds that bind the G1 beta-strand of a
periplasmic chaperone or the amino-terminal end of a pilus subunit. Such compounds may be
used as lead compounds in pharmaceutical efforts to identify compounds that inhibit pilus
biogenesis as a therapeutic approach toward the treatment of several types of disease caused
by pathogenic Gram-negative bacteria such as Escherichia coli, Haemophilus influenzae,
Salmonella enteritidis, Salmonella typhimurium, Bordetella pertussis, Yersinia enterocolitica,
Yersinia pestis, Helicobacter pylori and Klebsiella pneumoniae.

In a further aspect of the invention, such potential antibacterial compounds are
evaluated for their capacity to prevent or treat a bacterial infection. These methods comprise
designing and synthesizing candidate antibacterial compounds using the atomic coordinates
of the three dimensional structure of such co-crystals and screening them for their ability to
bind to pilus subunits, thereby inhibiting or preventing pilus biogenesis. The antibacterial
activity of the compound is determined by assaying the bacterium for infectivity or monitoring
the pilus for activity. Such compounds which are able to prevent or inhibit pilus biogenesis or
the ability of the bacterial pilus to infect a host tissue can be used in the pharmaceutical
compositions of the present invention.

Additionally, the invention encompasses machine readable media embedded with the 
three-dimensional structures of the models described herein, or with portions thereof. As
used herein, "machine readable medium" refers to any medium that can be read and accessed
directly by a computer or scanner. Such media include, but are not limited to: magnetic
storage media, such as floppy discs, hard disc storage medium and magnetic tape; optical
storage media such as optical discs or CD-ROM; electrical storage media such as RAM or
ROM; and hybrids of these categories such as magnetic/optical storage media. Such media
further include paper on which is recorded a representation of the atomic structure
coordinates, e.g., Cartesian coordinates, that can be read by a scanning device and converted
into a three-dimensional structure with Optical Character Recognition (OCR).

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon the atomic structure coordinates of the 
invention or portions thereof and/or X-ray diffraction data. The choice of the data storage
structure will generally be based on the means chosen to access the stored information. In
addition, a variety of data processor programs and formats can be used to store the sequence
and X-ray data information on a computer readable medium. Such formats include, but are
not limited to, Protein Data Bank ("PDB") format (Research Collaboratory for Structural
Bioinformatics; http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_fi^^);
Cambridge Crystallographic Data Centre format
(http://www.ccdc.cam.ac.uk/support/csd_doc/volume3/z323.html); Structure-data ("SD") file
format (MDL Information Systems, Inc.; Dalby et al., 1992, J. Chem. Inf. Comp. Sci. 32:244-
255); and line notation, e.g., as used in SMILES (Weininger, 1988, J. Chem. Inf. Comp. Sci.
28:31-36). Methods of converting between various formats read by different computer
software will be readily apparent to those of skill in the art, e.g., BABEL (v. 1.06, Walters &
Stahl, ©1992, 1993, 1994; http://www.brunel.ac.uk/departments/chem/babel.htm). All
format representations of the polypeptide coordinates described herein, or portions thereof,
are contemplated by the present invention. By providing computer readable medium having
stored thereon the atomic coordinates of the invention, one of skill in the art can routinely
access the atomic coordinates of the invention, or portions thereof, and related information for 
use in modeling and design programs, described in detail below. 
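
As a non-limiting illustration of reading coordinates stored in PDB format, the sketch below parses one fixed-column ATOM record using the column positions of the published PDB file format. The Atom type, the function name parse_atom_record and the sample record (an alanine nitrogen with made-up coordinates) are assumptions for this example and are not coordinates of the invention.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def parse_atom_record(line):
    """Parse a fixed-column PDB ATOM/HETATM record into an Atom tuple."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM or HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
        occupancy=float(line[54:60]),  # columns 55-60
        b_factor=float(line[60:66]),   # columns 61-66 (temperature factor)
    )


# Hypothetical record, assembled field by field so the column layout is explicit.
SAMPLE_LINE = "".join([
    "ATOM  ",                            # 1-6   record name
    "    1",                             # 7-11  atom serial number
    " ",                                 # 12    unused
    " N  ",                              # 13-16 atom name
    " ",                                 # 17    alternate location
    "ALA",                               # 18-20 residue name
    " ",                                 # 21    unused
    "A",                                 # 22    chain identifier
    "   1",                              # 23-26 residue sequence number
    " " * 4,                             # 27-30 insertion code and padding
    "  11.104", "   6.134", "  -6.504",  # 31-54 x, y, z in angstroms
    "  1.00",                            # 55-60 occupancy
    " 15.20",                            # 61-66 temperature factor
    " " * 10,                            # 67-76 unused
    " N",                                # 77-78 element symbol
])

atom = parse_atom_record(SAMPLE_LINE)
```

Slicing by column rather than splitting on whitespace is the safer choice for this format, because adjacent numeric fields may touch when values are large or negative.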

While Cartesian coordinates are important and convenient representations of the 
three-dimensional structure of a polypeptide, those of skill in the art will readily recognize
that other representations of the structure are also useful. Therefore, the three-dimensional
structure of a polypeptide, as discussed herein, includes not only the Cartesian coordinate
representation, but also all alternative representations of the three-dimensional distribution of
atoms. For example, atomic coordinates may be represented as a Z-matrix, wherein a first
atom of the protein is chosen, a second atom is placed at a defined distance from the first
atom, and a third atom is placed at a defined distance from the second atom so that it makes a
defined angle with the first atom. Each subsequent atom is placed at a defined distance from
a previously placed atom with a specified angle with respect to the third atom, and at a
specified torsion angle with respect to a fourth atom. Atomic coordinates may also be
represented as a Patterson function, wherein all interatomic vectors are drawn and are then
placed with their tails at the origin. This representation is particularly useful for locating
heavy atoms in a unit cell. In addition, atomic coordinates may be represented as a series of
vectors having magnitude and direction and drawn from a chosen origin to each atom in the
polypeptide structure. Furthermore, the positions of atoms in a three-dimensional structure
may be represented as fractions of the unit cell (fractional coordinates), or in spherical polar
coordinates. 
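
By way of non-limiting example, the interconversion of these representations can be sketched as follows. The code assumes the common convention in which the a axis lies along x and the b axis lies in the xy plane; the function names and the cell dimensions used in the usage lines are hypothetical.

```python
import math


def fractional_to_cartesian(frac, cell):
    """Convert fractional coordinates (u, v, w) to Cartesian angstroms.

    cell is (a, b, c, alpha, beta, gamma) with lengths in angstroms and
    angles in degrees; a is placed along x and b in the xy plane.
    """
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # Unit-cell volume divided by a*b*c.
    volume_factor = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    u, v, w = frac
    x = a * u + b * cg * v + c * cb * w
    y = b * sg * v + c * (ca - cb * cg) / sg * w
    z = c * volume_factor / sg * w
    return (x, y, z)


def cartesian_to_spherical(xyz):
    """Convert (x, y, z) to (r, polar angle from z, azimuth) in radians."""
    x, y, z = xyz
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0.0 else 0.0
    phi = math.atan2(y, x)
    return (r, theta, phi)


# Hypothetical orthorhombic cell: all angles are 90 degrees.
orthorhombic = (10.0, 20.0, 30.0, 90.0, 90.0, 90.0)
position = fractional_to_cartesian((0.5, 0.25, 0.1), orthorhombic)

# Hypothetical monoclinic cell: the c axis must keep its 10 angstrom length.
monoclinic = (10.0, 10.0, 10.0, 90.0, 120.0, 90.0)
c_axis = fractional_to_cartesian((0.0, 0.0, 1.0), monoclinic)
```

For the orthorhombic cell the conversion reduces to scaling each fraction by the corresponding cell edge, which makes it a convenient check on the general formula.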

Additional information, such as thermal parameters, which measure the motion of
each atom in the structure, chain identifiers, which identify the particular chain of a multi-
chain protein or protein co-complex in which an atom is located, and connectivity
information, which indicates to which atoms a particular atom is bonded, is also useful for
representing a three-dimensional molecular structure.

Uses of the Atomic Structure Coordinates

Structure information, typically in the form of the atomic structure coordinates, can be
used in a variety of computational or computer-based methods to, for example, design, screen
for and/or identify compounds that bind the crystallized polypeptide or a portion or fragment
thereof, or to intelligently design mutants that have altered biological properties.

In one embodiment, the co-crystals and structure coordinates obtained therefrom are
useful for identifying and/or designing compounds that bind PapD, PapK, FimC or FimH as
an approach towards developing new therapeutic agents. For example, a high resolution
X-ray structure will often show the locations of ordered solvent molecules around the protein,
and in particular at or near putative binding sites on the protein. This information can then be
used to design molecules that bind these sites, and the compounds can then be synthesized
and tested for binding in biological assays. Travis, 1993, Science 262:1374.

In another embodiment, the structures are probed with a plurality of molecules to
determine their ability to bind to PapD, PapK, FimC or FimH at various sites. Such
compounds can be used as targets or leads in medicinal chemistry efforts to identify, for
example, inhibitors of potential therapeutic importance. 

In specific embodiments described herein, the high resolution X-ray structures of the 
PapD/PapK and FimC/FimH co-complexes show details of the interactions between PapD 
and PapK, and between FimC and FimH, respectively. This information can be used to
design molecules that bind to the sites of interaction, thereby blocking co-complex formation.
In addition, the X-ray structure of the FimC/FimH co-complex has a C-HEGA molecule
bound in the mannose-binding pocket of FimH, which can be used to model compounds that
bind to the lectin and inhibit the FimH interaction with mannose oligosaccharides on host
cells. 

In yet another embodiment, the structures can be used to computationally screen small
molecule databases for chemical entities or compounds that can bind in whole, or in part, to
PapD, PapK, FimC or FimH. In this screening, the quality of fit of such entities or
compounds to the binding site may be judged either by shape complementarity or by
estimated interaction energy. Meng et al., 1992, J. Comp. Chem. 13:505-524.

The design of compounds that bind to PapD, PapK, FimC or FimH according to this 
invention generally involves consideration of two factors. First, the compound must be 
capable of physically and structurally associating with PapD, PapK, FimC or FimH. This 
association can be covalent or non-covalent. For example, covalent interactions may be
important for designing suicide or irreversible inhibitors of a protein. Non-covalent
molecular interactions important in the association of PapD with PapK or of FimC with FimH
include hydrogen bonding, ionic interactions and van der Waals and hydrophobic
interactions. Second, the compound must be able to assume a conformation that allows it to
associate with PapD, PapK, FimC or FimH. Although certain portions of the compound will
not directly participate in this association with the protein, those portions may still influence
the overall conformation of the molecule. This, in turn, may have a significant impact on
potency. Such conformational requirements include the overall three-dimensional structure
and orientation of the chemical group or compound in relation to all or a portion of the
binding site, or the spacing between functional groups of a compound comprising several
chemical groups that directly interact with the protein. 

The potential inhibitory or binding effect of a chemical compound on PapD, PapK,
FimC or FimH may be analyzed prior to its actual synthesis and testing by the use of
computer modeling techniques. If the theoretical structure of the given compound suggests
insufficient interaction and association between it and the protein, synthesis and testing of the
compound is unnecessary. However, if computer modeling indicates a strong interaction, the
molecule may then be synthesized and tested for its ability to bind to the protein and inhibit 
its activity. In this manner, synthesis of ineffective compounds may be avoided. 

An inhibitory or other binding compound of PapD, PapK, FimC or FimH may be 
computationally evaluated and designed by means of a series of steps in which chemical 
groups or fragments are screened and selected for their ability to associate with the individual 
binding pockets or interface surfaces of each of the proteins. One skilled in the art may use 
one of several methods to screen chemical groups or fragments for their ability to associate 
with PapD, PapK, FimC or FimH. This process may begin by visual inspection of, for
example, the protein/protein interfaces or the mannose-binding site of FimH on the computer
screen based on the PapD/PapK or FimC/FimH co-complex coordinates. Selected fragments
or chemical groups may then be positioned in a variety of orientations, or docked, at an
individual surface of PapD, PapK, FimC or FimH that participates in a protein/protein
interface in the co-complex, or in the mannose-binding pocket of FimH, as defined supra.
Docking may be accomplished using software such as QUANTA and SYBYL, followed by
energy minimization and molecular dynamics with standard molecular mechanics forcefields, 
such as CHARMM and AMBER. 
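
As a non-limiting sketch of positioning a fragment in several candidate placements and ranking them by energy, the following scores rigid translations with a generic 12-6 Lennard-Jones term. It is a simplified stand-in and does not reproduce QUANTA, SYBYL, CHARMM or AMBER; the epsilon and sigma values, the function names and the single-atom site are assumptions chosen for illustration.

```python
import math


def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """12-6 energy (kcal/mol) for one atom pair at separation r (angstroms).

    epsilon and sigma are illustrative values, not force-field parameters.
    """
    r = max(r, 1e-6)  # guard against division by zero for coincident atoms
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def pose_energy(fragment, site):
    """Sum the pair energies between every fragment atom and every site atom."""
    return sum(lennard_jones(math.dist(p, q)) for p in fragment for q in site)


def translate(coords, shift):
    """Rigidly translate a list of (x, y, z) tuples by shift."""
    return [tuple(c + s for c, s in zip(atom, shift)) for atom in coords]


def best_pose(fragment, site, offsets):
    """Score each candidate translation and return the lowest-energy pose."""
    candidates = (translate(fragment, shift) for shift in offsets)
    return min(candidates, key=lambda pose: pose_energy(pose, site))


# Hypothetical one-atom binding site and one-atom fragment.
site = [(0.0, 0.0, 0.0)]
fragment = [(0.0, 0.0, 0.0)]
offsets = [(d, 0.0, 0.0) for d in (2.0, 3.0, 3.9, 5.0, 8.0)]

pose = best_pose(fragment, site, offsets)
```

Of the five trial distances, the 3.9 angstrom placement lies closest to the minimum of this potential and is therefore selected; a practical docking run would also sample rotations and follow with energy minimization as described above.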

Specialized computer programs may also assist in the process of selecting fragments 
or chemical groups. These include: 

1. GRID (Goodford, 1985, J. Med. Chem. 28:849-857). GRID is available from
Oxford University, Oxford, UK;

2. MCSS (Miranker & Karplus, 1991, Proteins: Structure, Function and Genetics
11:29-34). MCSS is available from Molecular Simulations, Burlington, MA;

3. AUTODOCK (Goodsell & Olson, 1990, Proteins: Structure, Function, and
Genetics 8:195-202). AUTODOCK is available from Scripps Research Institute, La Jolla,
CA; and

4. DOCK (Kuntz et al., 1982, J. Mol. Biol. 161:269-288). DOCK is available
from University of California, San Francisco, CA.

Once suitable chemical groups or fragments have been selected, they can be
assembled into a single compound or inhibitor. Assembly may proceed by visual inspection
of the relationship of the fragments to each other in the three-dimensional image displayed on
a computer screen in relation to the structure coordinates of PapD, PapK, FimC or FimH.
This would be followed by manual model building using software such as QUANTA or
SYBYL.

Useful programs to aid one of skill in the art in connecting the individual chemical 
groups or fragments include: 

1. CAVEAT (Bartlett et al., 1989, "CAVEAT: A Program to Facilitate the
Structure-Derived Design of Biologically Active Molecules," in Molecular Recognition in
Chemical and Biological Problems, Special Pub., Royal Chem. Soc. 78:182-196). CAVEAT
is available from the University of California, Berkeley, CA; 

2. 3D Database systems such as MACCS-3D (MDL Information Systems, San
Leandro, Calif.). This area is reviewed in Martin, 1992, J. Med. Chem. 35:2145-2154; and

3. HOOK (available from Molecular Simulations, Burlington, Mass.).

Instead of proceeding to build an inhibitor of PapD/PapK or FimC/FimH co-complex
formation, or of mannose binding to FimH, in a step-wise fashion one fragment or chemical
group at a time, as described above, PapD-, PapK-, FimC- or FimH-binding compounds may
be designed as a whole or 'de novo' using either an empty binding site or the surface of a
protein that participates in protein/protein interactions in a co-complex, or optionally
including some portion(s) of a known inhibitor(s) or of the second protein in the co-complex
that participates in a particular protein/protein interaction at an interface. These methods
include: 

1. LUDI (Bohm, 1992, J. Comp. Aid. Molec. Design 6:61-78). LUDI is available 
from Molecular Simulations, Inc., San Diego, CA; 

2. LEGEND (Nishibata & Itai, 1991, Tetrahedron 47:8985). LEGEND is
available from Molecular Simulations, Burlington, Mass.; and 

3. LeapFrog (available from Tripos, Inc., St. Louis, Mo.). 

Other molecular modeling techniques may also be employed in accordance with this 
invention. See, e.g., Cohen et al., 1990, J. Med. Chem. 33:883-894. See also, Navia &
Murcko, 1992, Current Opinions in Structural Biology 2:202-210.

Once a compound has been designed or selected by the above methods, the efficiency
with which that compound may bind to PapD, PapK, FimC or FimH may be tested and 
optimized by computational evaluation. For example, a compound that has been designed or 
selected to function as a FimH mannose-binding inhibitor must also preferably occupy a
volume not overlapping the volume occupied by the mannose-binding site residues when
mannose is bound. An effective inhibitor of PapD/PapK or FimC/FimH co-complex
formation, or of FimH mannose binding, must preferably demonstrate a relatively small
difference in energy between its bound and free states (i.e., it must have a small deformation
energy of binding). Thus, the most efficient inhibitors should preferably be designed with a
deformation energy of binding of not greater than about 10 kcal/mol, preferably, not greater
than 7 kcal/mol. Inhibitors may interact with the protein in more than one conformation that
is similar in overall binding energy. In those cases, the deformation energy of binding is
taken to be the difference between the energy of the free compound and the average energy of
the conformations observed when the inhibitor binds to the protein.
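
By way of non-limiting illustration, the deformation energy criterion can be expressed as a short calculation. The sign convention (average bound-state energy minus free-state energy, so that strain is positive), the function names and the energy values are assumptions for this sketch; the 10 and 7 kcal/mol limits are those stated above.

```python
def deformation_energy(free_energy, bound_energies):
    """Average energy of the bound conformations minus the free-state energy.

    All energies are in kcal/mol; a positive result is conformational strain.
    """
    if not bound_energies:
        raise ValueError("at least one bound conformation is required")
    return sum(bound_energies) / len(bound_energies) - free_energy


def classify_deformation(deformation, limit=10.0, preferred_limit=7.0):
    """Compare a deformation energy with the limits given in the text."""
    if deformation <= preferred_limit:
        return "preferred"
    if deformation <= limit:
        return "acceptable"
    return "too strained"


# Hypothetical conformational energies for one candidate inhibitor.
free_state = -42.0
bound_states = [-36.5, -35.5]

strain = deformation_energy(free_state, bound_states)  # -36.0 - (-42.0) = 6.0
verdict = classify_deformation(strain)
```

Averaging over the bound conformations follows the rule given above for inhibitors that bind in more than one conformation of similar overall binding energy.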

A compound selected or designed for binding to PapD, PapK, FimC or FimH may be
further computationally optimized so that in its bound state it would preferably lack repulsive
electrostatic interaction with the target protein. Such non-complementary electrostatic
interactions include repulsive charge-charge, dipole-dipole and charge-dipole interactions.
Specifically, the sum of all electrostatic interactions between the inhibitor and the protein
when the inhibitor is bound to it preferably makes a neutral or favorable contribution to the
enthalpy of binding. 
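
As a non-limiting sketch of the electrostatic check, the following sums pairwise Coulomb energies between point charges on the inhibitor and on the protein. The uniform dielectric constant of 4.0, the partial charges and the positions are assumptions chosen for illustration; the programs listed below apply more complete treatments.

```python
import math

# Coulomb constant in kcal*angstrom/(mol*e^2), as commonly used in molecular mechanics.
COULOMB_KCAL = 332.0636


def electrostatic_energy(ligand, protein, dielectric=4.0):
    """Sum Coulomb pair energies (kcal/mol) between two sets of point charges.

    Each entry is ((x, y, z), charge) with positions in angstroms and
    charges in units of the elementary charge.
    """
    total = 0.0
    for pos_l, q_l in ligand:
        for pos_p, q_p in protein:
            r = math.dist(pos_l, pos_p)
            total += COULOMB_KCAL * q_l * q_p / (dielectric * r)
    return total


# Hypothetical charges: one ligand atom near two protein atoms.
ligand = [((0.0, 0.0, 0.0), -0.5)]
protein = [((3.0, 0.0, 0.0), 0.5), ((0.0, 4.0, 0.0), -0.25)]

energy = electrostatic_energy(ligand, protein)
is_neutral_or_favorable = energy <= 0.0
```

In this example the attractive pair outweighs the repulsive pair, so the summed contribution is negative and the criterion stated above is met.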

Specific computer software is available in the art to evaluate compound deformation 
energy and electrostatic interaction. Examples of programs designed for such uses include:
Gaussian 92, revision C (Frisch, Gaussian, Inc., Pittsburgh, PA, ©1992); AMBER, version
4.0 (Kollman, University of California at San Francisco, ©1994); QUANTA/CHARMM
(Molecular Simulations, Inc., Burlington, MA, ©1994); and Insight II/Discover (Biosym
Technologies Inc., San Diego, CA, ©1994). These programs may be implemented, for
instance, using a computer workstation, as is well-known in the art. Other hardware systems
and software packages will be known to those skilled in the art. 

Once a PapD-, PapK-, FimC- or FimH-binding compound has been optimally selected
or designed, as described above, substitutions may then be made in some of its atoms or
chemical groups in order to improve or modify its binding properties. Generally, initial
substitutions are conservative, i.e., the replacement group will have approximately the same
size, shape, hydrophobicity and charge as the original group. One of skill in the art will
understand that substitutions known in the art to alter conformation should be avoided. Such
altered chemical compounds may then be analyzed for efficiency of binding to PapD, PapK,
FimC or FimH by the same computer methods described in detail above. 

Because PapD/PapK co-complexes may crystallize in more than one crystal form, the
structure coordinates of PapD/PapK co-complex, of PapD alone, of PapK alone, or of
portions thereof, are particularly useful to solve the structure of those other co-crystal forms
of PapD/PapK co-complex. They may also be used to solve the structure of mutants, of
PapD/PapK co-complex further complexed to another molecule, or of the crystalline form of
any other protein or protein co-complex with significant amino acid sequence homology to
any functional domain of PapD or PapK. Similarly, the structure coordinates of FimC/FimH
co-complex, of FimC alone, of FimH alone, or of portions thereof, are particularly useful to
solve the structure of other co-crystal forms of FimC/FimH co-complex. They may also be
used to solve the structure of mutants, of FimC/FimH co-complex further complexed to
another molecule, or of the crystalline form of any other protein or protein co-complex with
significant amino acid sequence homology to any functional domain of FimC or FimH. 

One method that may be employed for this purpose is molecular replacement. In this
method, the unknown co-crystal structure, whether it is another co-crystal form of a
PapD/PapK or FimC/FimH co-complex, a mutant, a PapD/PapK or FimC/FimH co-complex
that is further complexed to another molecule, or the crystal of some other protein or protein
co-complex with significant amino acid sequence homology to any functional domain of one
of the proteins in the co-complex crystal, may be determined using phase information from
the PapD/PapK or FimC/FimH structure coordinates, respectively. This method will provide
an accurate three-dimensional structure for the unknown protein or protein co-complex in the
new crystal more quickly and efficiently than attempting to determine such information ab
initio. 

If an unknown crystal form has the same space group as and similar cell dimensions 
to the known co-complex crystal form, then the phases derived from the known crystal form 
can be directly applied to the unknown crystal form, and in turn, an electron density map for
the unknown crystal form can be calculated. Difference electron density maps can then be
used to examine the differences between the unknown crystal form and the known crystal
form. A difference electron density map is a subtraction of one electron density map, e.g.,
that derived from the known crystal form, from another electron density map, e.g., that
derived from the unknown crystal form. Therefore, all similar features of the two electron
density maps are eliminated in the subtraction and only the differences between the two
structures remain. For example, if the unknown crystal form is of a FimC/FimH co-complex
that is further complexed with a mannose analog in the FimH mannose binding site, then a
difference electron density map between this map and the map derived from the native,
uncomplexed crystal will ideally show only the electron density of the differences between C-
HEGA and the mannose analog. Similarly, if amino acid side chains have different
conformations in the two crystal forms, then those differences will be highlighted by peaks
(positive electron density) and valleys (negative electron density) in the difference electron
density map, making the differences between the two crystal forms easy to detect. However,
if the space groups and/or cell dimensions of the two crystal forms are different, then this
approach will not work and molecular replacement must be used in order to derive phases for 
the unknown crystal form. 
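
By way of non-limiting illustration, the decision between direct use of known phases and molecular replacement, and the subtraction that yields a difference map, can be sketched as follows. The one percent cell tolerance, the function names, the space group labels, the cell dimensions and the five-point maps are assumptions for this example and are not data from the co-crystals.

```python
def can_reuse_phases(space_group_a, cell_a, space_group_b, cell_b, tolerance=0.01):
    """Return True when two crystal forms share a space group and similar cells.

    tolerance is the allowed fractional difference per cell parameter; the
    value is an assumption, since "similar" is not quantified in the text.
    """
    if space_group_a != space_group_b:
        return False
    return all(
        abs(p - q) <= tolerance * max(abs(p), abs(q))
        for p, q in zip(cell_a, cell_b)
    )


def difference_map(map_unknown, map_known):
    """Subtract the known map from the unknown map, grid point by grid point."""
    if len(map_unknown) != len(map_known):
        raise ValueError("maps must be sampled on the same grid")
    return [u - k for u, k in zip(map_unknown, map_known)]


def peaks_and_valleys(diff, threshold):
    """Return grid indices of positive peaks and negative valleys."""
    peaks = [i for i, v in enumerate(diff) if v >= threshold]
    valleys = [i for i, v in enumerate(diff) if v <= -threshold]
    return peaks, valleys


# Hypothetical cells: (a, b, c, alpha, beta, gamma).
known_cell = (60.0, 70.0, 80.0, 90.0, 90.0, 90.0)
similar_cell = (60.2, 70.1, 79.8, 90.0, 90.0, 90.0)
reuse = can_reuse_phases("P212121", known_cell, "P212121", similar_cell)

# Hypothetical density samples along one line of the grid.
known_map = [0.1, 0.9, 0.8, 0.1, 0.0]
unknown_map = [0.1, 0.9, 0.2, 0.7, 0.0]
peaks, valleys = peaks_and_valleys(difference_map(unknown_map, known_map), 0.5)
```

Features common to both maps cancel in the subtraction, leaving a peak where density appears only in the unknown form and a valley where density is present only in the known form.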

All of the complexes referred to above may be studied using well-known X-ray 
diffraction techniques and may be refined versus IJ Å or higher to 3 Å resolution X-ray data
to an R value of about 0.20 or less using computer software, such as X-PLOR (Yale
University, ©1992, distributed by Molecular Simulations, Inc.). See, e.g., Blundell et al.,
1976, Protein Crystallography, Academic Press; Methods in Enzymology, vol. 114 & 115,
Wyckoff et al., eds., Academic Press, 1985. This information may thus be used to optimize
known classes of inhibitors of PapD/PapK or FimC/FimH co-complex formation or of
mannose binding to FimH, and more importantly, to design and synthesize novel classes of
inhibitors of PapD/PapK or FimC/FimH co-complex formation or of mannose binding to
FimH. 

The structure coordinates of PapD/PapK or FimC/FimH mutant co-complexes will
also facilitate the identification of related protein co-complexes analogous to the PapD/PapK
or FimC/FimH co-complexes in function, structure or both, thereby further leading to novel
therapeutic modes for treating or preventing Gram-negative bacteria-mediated diseases.

Subsets of the atomic structure coordinates can be used in any of the above methods.
Particularly useful subsets of the coordinates include, but are not limited to, coordinates of
single domains, coordinates of residues lining an active site, coordinates of residues that
participate in important protein-protein contacts at an interface, and Cα coordinates. For
example, the coordinates of one domain of a protein that contains the active site may be used
to design inhibitors that bind to that site, even though the protein is fully described by a larger
set of atomic coordinates. Therefore, as described in detail for the specific embodiments,
below, a set of atomic coordinates that define the entire polypeptide chain, although useful for
many applications, does not necessarily need to be used for the methods described herein.
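
As a non-limiting sketch of extracting such subsets from a larger coordinate set, the following selects Cα atoms and the atoms of chosen residues on one chain. The Atom type, the function names, the chain labels and the five atoms listed are hypothetical and do not correspond to residues of PapD, PapK, FimC or FimH.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    name: str
    chain_id: str
    res_seq: int
    xyz: tuple


def alpha_carbons(atoms):
    """Keep only the C-alpha atom of each residue."""
    return [atom for atom in atoms if atom.name == "CA"]


def residue_subset(atoms, chain_id, residue_numbers):
    """Keep atoms of the listed residues on one chain, e.g. a binding pocket."""
    wanted = set(residue_numbers)
    return [
        atom for atom in atoms
        if atom.chain_id == chain_id and atom.res_seq in wanted
    ]


# Hypothetical two-chain coordinate set.
atoms = [
    Atom("N", "A", 1, (0.0, 0.0, 0.0)),
    Atom("CA", "A", 1, (1.5, 0.0, 0.0)),
    Atom("CA", "A", 2, (5.3, 0.0, 0.0)),
    Atom("CA", "B", 1, (0.0, 9.0, 0.0)),
    Atom("CB", "B", 1, (0.0, 10.5, 0.0)),
]

trace = alpha_carbons(atoms)
pocket = residue_subset(atoms, "A", [1])
```

The same filters could be applied to records read from a coordinate file, so that only the domain, interface or pocket of interest is passed to a design or screening program.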

Uses of subsets of atomic coordinates in specific embodiments

The structure coordinates of the present invention, and subsets thereof, are useful for
designing or screening for compounds that bind to the PapD, PapK, FimC or FimH proteins.
The high resolution X-ray structures of the PapD/PapK and FimC/FimH co-complexes of the
present invention show details of the interactions between PapD and PapK, and between
FimC and FimH, respectively. This information can be used to design and/or screen for
compounds that bind to the sites of interaction, thereby blocking co-complex formation and
pilus assembly. In addition, the X-ray structure of the FimC/FimH co-complex has a C-HEGA
molecule bound in the mannose-binding pocket of FimH, which can be used to model
compounds that bind to the lectin domain and inhibit the FimH interaction with mannose on
host cells.

Those of skill in the art will recognize that the complete set of PapD/PapK co-complex
structure coordinates and the complete set of FimC/FimH co-complex structure
coordinates will be useful in the methods of the present invention. Those of skill in the art
will further recognize that the coordinates of PapD, PapK, FimC and FimH will be useful
separate from the coordinates of the protein with which each protein forms a co-complex in
the crystals. In addition, those of skill in the art will recognize that subsets of the structure
coordinates of each protein, such as the coordinates of a single domain or interface or binding
pocket, will be useful in the methods of the invention, as discussed in more detail, below.

In one embodiment, the PapK coordinates, or the subset of PapK coordinates that are
the residues in the hydrophobic groove region of PapK (the K1 region), where the G1
beta-strand of PapD interacts with PapK in the co-complex crystal structure, are useful for
designing and/or screening for compounds that bind in the groove in order to prevent pilus
assembly. A subset of structure coordinates of PapK useful in this embodiment of the
invention include those of Val•«Leu"^Va^^«Phe'«Phe^,^e■««' Phe*« He'"' Ile'^-^ 
Tyr-Ala'-TV-Phe'-Leu-andTyr'-asnumberedinF^ ' ' ' 

In another embodiment, the PapD coordinates, or the subset of PapD coordinates that
are the G1 beta-strand residues (the D1 region), which interacts with the K1 region by fitting
into the hydrophobic groove of PapK in the PapD/PapK co-complex structure, are useful for
designing compounds that have an analogous shape, such that the compounds fit into the
PapK groove and inhibit pilus assembly. A subset of G1 beta-strand structure coordinates of
PapD useful in this embodunent include those of Leu'"". Gfa""". ne"»° Ala'"" and Leu"™ 

In yet another embodiment, the PapD coordinates, or a subset of PapD coordinates in
the D2 region, and the PapK coordinates, or a subset of PapK coordinates in the K2 region,
which participate in a second interface of the PapD/PapK co-complex, are useful for
designing and/or screening for compounds that disrupt this interaction and prevent PapD-PapK
co-complex formation. A subset of PapK coordinates useful for this embodiment of the
mvention include those of residues Val««, Gly^, Lys*"' and Arg"'''. A subset of PapD 
coordinates usefiil for thisembodiment of the invention include those of residues Thr'«" 
Ile'«". Glu'""*. Glu'«", Thr'™'^. He"*" and Axg^. 

In another embodiment, the FimH coordinates, a subset of the FimH coordinates that
are the pilin domain of FimH, or a subset of FimH coordinates that are the residues in the
hydrophobic groove region of the pilin domain, where the G1 beta-strand of FimC interacts
with FimH, are useful for designing and/or screening for compounds that inhibit this
interaction, thereby inhibiting pilus formation in type 1 pili. A subset of FimH structure
coordinates usefiil in this embodiment of the invention include those of residues Ala'»« 
Asn«» Val'«« and Val'«« as nmnbered in Fig.'8. 

In yet another embodiment, the FimC coordinates, or a subset of FimC coordinates
that are the residues of the G1 beta-strand that interact with the hydrophobic groove region of
FimH, are useful for designing compounds that have an analogous shape, such that the
compounds fit into the FimH groove and inhibit type 1 pilus assembly. A subset of FimC
structure coordinates useful in this embodiment of the invention include those of residues
ne'»«,Leu«»«=andne""^.

In another embodiment, the FimH coordinates, a subset of FimH coordinates that are
the lectin domain of FimH, or a subset of FimH coordinates that comprise the mannose
binding pocket of the lectin domain are useful for designing and/or screening for compounds
that fit into the mannose binding pocket and block the interaction of FimH with host cell
mannose oligosaccharides, thus preventing adhesion to host cells and E. coli pathogenesis. A
subset of structure coordinates useful in this embodiment of the invention include those of
residues Phe'«. Asn«", Asp^™. Tyi^«. ne«". Asp««. Ghi»« Asn'»«. Tyr"^™ Asn"«« Asn-" 
andPhe'^. 



The following examples illustrate the invention, but are not to be taken as limiting the
various aspects of the invention so illustrated.



EXAMPLES 

Example 1: The PapD-PapK Chaperone-Subunit Co-complex
Expression of the PapD-PapK Co-complex. The PapD-PapK co-complex was
overexpressed in E. coli and periplasms were prepared as described by Slonim et al. (EMBO J.
1992, 11:4747). Periplasms were then subjected to cation exchange (15S Source
(Pharmacia)) followed by hydrophobic interaction (15PHE Source (Pharmacia))
chromatography to yield pure co-complex. Selenomethionine (Se-Met) PapD-PapK
co-complexes were expressed in the E. coli methionine-auxotroph DL41 strain as
described by Hendrickson et al. (EMBO J. 1990, 9:1665) and purified as was the wild-type
co-complex. The purified wild-type or Se-Met PapD-PapK co-complexes were dialyzed
against 20 mM KMES pH 6.7 and concentrated to ~12 mg/ml. Co-crystals were grown by
vapor diffusion using the hanging drop method against a reservoir containing 10-15% (w/v)
PEG 6000, 100 mM potassium acetate, and 200-400 mM sodium acetate at pH 4.6 [A.
McPherson, Eur. J. Biochem. 189, 23 (1990)] and appeared within three to five days. The
co-crystals were cryoprotected by increasing the concentration of PEG 6000 to 25% (w/v)
and flash-cooled to liquid nitrogen temperature. Co-crystals were in space group P2₁2₁2₁,
with cell dimensions a = 62.12 ± 0.2 Å, b = 63.69 ± 0.2 Å, and c = 92.72 ± 0.2 Å, and with
one co-complex in the asymmetric unit. Table 4 contains a summary of the data collected and
refinement statistics.
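
As a worked check on the cell dimensions quoted above, the sketch below computes the unit cell volume from the general triclinic formula, which reduces to a·b·c for an orthorhombic cell. The function name `unit_cell_volume` is a hypothetical helper; the division by four relies on the standard fact that space group P2₁2₁2₁ has four asymmetric units per cell.

```python
import math


def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume in cubic angstroms of a general unit cell; angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


# Orthorhombic PapD-PapK cell: all angles are 90 degrees.
volume = unit_cell_volume(62.12, 63.69, 92.72)

# P2(1)2(1)2(1) has four asymmetric units per cell, and the text reports
# one co-complex per asymmetric unit.
asu_volume = volume / 4

print(round(volume), round(asu_volume))
```

The cell volume comes out near 3.67 × 10⁵ Å³, or roughly 9.2 × 10⁴ Å³ per co-complex.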

A complete data set to a resolution of 2.7 Å was collected in the laboratory setting
(Rigaku Raxis IV image plate mounted on a Rigaku RU200 rotating anode X-ray generator)
using an oscillation range of 1.5° and exposure time of 45 min/frame ("Native" data set in
Table 4). Se-Met PapD-PapK co-crystals were in the same space group with the same cell
dimensions. Once cooled, these co-crystals diffracted to slightly higher resolution in the
laboratory setting and a complete data set ("Se-Met Single" in Table 4) to a resolution of 2.5
Å was collected (2.5° oscillation range, 60 min/frame). These co-crystals were also used to
collect MAD data at the National Synchrotron Light Source at Brookhaven National
Laboratory (Beamline X4A). Complete data sets at four wavelengths to a resolution of 2.4 Å
were collected ("Se-Met 1-4" in Table 4). All data were reduced and processed using the
programs DENZO and SCALEPACK [Z. Otwinowski, in Proceedings of the CCP4 Study
Weekend, L. Sawyer, N. Isaacs, S. Bailey, Eds. (SERC Daresbury Laboratory,
Warrington, 1993), pp. 56-62].

Structure of PapD-PapK co-complex. The structure of the PapD-PapK co-complex
was solved using MAD phasing [W. A. Hendrickson, Science 254, 51 (1991)]. The PapD-PapK
co-complex contains three methionines, all of which are in PapD, at positions 18, 66,
and 172. The "Native" and "Se-Met Single" data sets were first used to generate a difference
Patterson map using the program HEAVY [T. C. Terwilliger and D. Eisenberg, Acta
Crystallogr. A39, 813 (1983)] where strong peaks could be readily located. Three heavy
metal positions were determined using the program HASSP [T. C. Terwilliger, S.-H. Kim, D.
Eisenberg, Acta Crystallogr. A43, 1 (1987)]. Initial SIRAS solvent-flattened phases were,
however, insufficient to build a model of PapK. Subsequently, multi-wavelength anomalous
diffraction (MAD) data were collected (Table 4). After local scaling using the high energy
remote wavelength ("SeMet-4" in Table 4) as the reference wavelength, MAD phases were
calculated using SHARP [E. de la Fortelle and G. Bricogne, Methods Enzymol. 276, 472
(1997)]. An interpretable electron density map was readily obtained after density
modification by solvent flipping (program SOLOMON [J. P. Abrahams and A. G. W. Leslie,
Acta Crystallogr. D52, 32 (1996)]). The PapD subunit was rebuilt into the experimental
electron density, starting from the apo-PapD structure. A Cα trace of the PapK subunit was
built into the experimental electron density map using program O [T. A. Jones and S. Thirup,
EMBO J. 5, 819 (1986); T. A. Jones, J. Y. Zou, S. W. Cowan, M. Kjeldgaard, Acta
Crystallogr. A47, 110 (1991)], accounting for all but 8 residues located at the NH2-terminus,
for which, even at later stages of the refinement, no electron density was observed. The
electron density was of sufficient quality (Fig. 1) to unequivocally assign the sequence. The
model was then refined using CNSsolve 0.5 [A. T. Brünger et al., Acta Crystallogr. D54, 905
(1998)] against the 'SeMet-S*' structure factor amplitudes using the maximum likelihood
refinement target with incorporation of experimental phase information [P. D. Adams, N. S.
Pannu, R. J. Read, A. T. Brünger, Proc. Natl. Acad. Sci. 94, 5018 (1997); N. S. Pannu, G. N.
Murshudov, E. J. Dodson, R. J. Read, Acta Crystallogr. D54, 1258 (1998)]. Both positional
and simulated annealing refinement in Cartesian space were used (the temperature factors
were set to 25 Å²) and resulted in values of R and free-R of 27.4 and 32.5%, respectively [A.
T. Brünger, J. Mol. Biol. 203, 803 (1988)]. After two rounds of rebuilding, where simulated
annealing omit maps were generated for ambiguous regions and used to adjust the model [A.
Hodel, S.-H. Kim, A. T. Brünger, Acta Crystallogr. A48, 851 (1992)], positional refinement
followed by restrained refinement of the temperature factors resulted in a model with R and
free-R values of 24.3 and 28.8%, respectively. At this stage, 104 well-defined water
molecules were added, resulting in a final model with R and free-R values of 23.8% and
27.4%, respectively. The stereochemistry of the model is excellent and the temperature
factors are restrained appropriately (Table 4). The model of PapK is complete between residues
9 and 157. Electron density was poor for residues 216 to 218 of PapD and therefore this
region was not included in the final model. Also, for the same reason, residues Arg^ and
Glu" in PapD were built as alanines. All residues in PapK and PapD are located in either the
most favored or the allowed regions of the Ramachandran plot [G. N. Ramachandran and V.
Sasisekharan, Adv. Protein Chem. 23, 283 (1968)]. Coordinates have been deposited at the
Protein Data Bank (entry code 1PDK).
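
The R and free-R values quoted above follow the conventional definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, with free-R computed over reflections held out of refinement. The sketch below shows that arithmetic only; it is not the CNS implementation, it assumes the two amplitude lists are already on a common scale, and the function names and the four amplitudes are made-up placeholders.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor for amplitudes on a common scale."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by a boolean test-set flag and score each set."""
    work = [(o, c) for o, c, is_free in zip(f_obs, f_calc, free_flags) if not is_free]
    free = [(o, c) for o, c, is_free in zip(f_obs, f_calc, free_flags) if is_free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Placeholder amplitudes; the last reflection is flagged as test-set.
f_obs = [100.0, 50.0, 25.0, 80.0]
f_calc = [90.0, 55.0, 25.0, 100.0]
free_flags = [False, False, False, True]

r_all = r_factor(f_obs, f_calc)
r_work, r_free = r_work_and_free(f_obs, f_calc, free_flags)
print(round(r_all, 4), round(r_work, 4), round(r_free, 4))
```

In practice the test set is a small random fraction of the data, and a free-R that stays a few percentage points above R, as in the values reported here, is the usual sign that the model has not been overfitted.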

COOH-terminally truncated Ig fold of PapK. PapK has the same overall variable-region
immunoglobulin-like (Ig) fold as the amino-terminal domain of PapD, with two
beta-sheets coming together in a beta-sandwich (Figs. 2A and 3A; see also Fig. 2A for secondary
structure notation). However, the Ig fold of PapK is incomplete: it lacks the COOH-terminal
seventh strand, G, which in canonical Ig folds forms an antiparallel beta-sheet interaction
with strand F and contributes to the hydrophobic core of the protein. Remarkably, in the
PapD-PapK co-complex, this missing strand is provided by PapD, which donates its G1
beta-strand to complete the Ig fold of PapK (Figs. 2A, 2B, and 3A). The Ig fold thus produced is
however atypical, since the donated strand runs parallel, rather than antiparallel, to strand F in
PapK. The insertion of the G1 beta-strand into the fold of the pilin, coined as "donor strand
complementation", has important implications for the mechanisms of subunit folding, capping
and assembly.

The first eight NH2-terminal residues of PapK are disordered. The Ig fold of PapK
(Fig. 3A) begins with a short beta-strand, A1, which makes typical antiparallel hydrogen
bonds with the COOH-terminal residues of strand B. This short beta-sheet arrangement is
interrupted by the insertion of a 3₁₀ helical turn (Figs. 2A and 3B) which results in strand A
switching sides in the beta-sandwich in order to make antiparallel beta-strand interactions
with the G1 beta-strand of the chaperone (Fig. 3A). Strands A and B are connected by a short
α-helix (αB in Figs. 2A and 3B) which precedes three successive aromatic residues (Phe",
Trp^*, Tyr", Fig. 3B). While Phe'* inserts into the hydrophobic core of the beta-sandwich,
Trp'^ and Tyr^ interact closely with residues at the COOH-terminus of helix D (Fig. 2A),
possibly contributing to its stability. Strand B forms the edge of one of the two beta-sheets in
the beta-sandwich and runs antiparallel to strand E. Following strand B, the structure crosses
over to the other side of the beta-sandwich through a short 3₁₀ helix (Fig. 2A) to form strand
C1, which runs antiparallel to strand F. The COOH-terminus of strand C1 deviates from the
beta-sheet arrangement to form a protruding beta-meander (strands C and C"). Strand C
reaches over to the other side of the beta-sandwich to form main-chain hydrogen bonds with
strand D1. This small beta-structure eventually returns, as C2, to make main-chain hydrogen
bonding interactions with strand F (Figs. 2A, 3A, and 3B).

An extended loop links strand C to strand D1 on the other side of the beta-sandwich.
Strand D constitutes an edge of the D, E, B, A1 beta-sheet. It therefore runs antiparallel to
strand E. However, strand D is divided in the middle by an insertion which meanders
towards the C, C" meander and reaches back to the E strand. Strand E is followed by a
three-turn helix (αD) and a long loop structure which connects it to the COOH-terminal strand
F. Finally, strand F, from Asp'** onward, forms a parallel beta-sheet with strand G1 of PapD
(Figs. 2A and 3A). Hence, strand G1 of PapD is an integral part of the C, F, A2 beta-sheet of
PapK.

Structure of PapD in the PapD-PapK Co-complex. Except for the F1-G1 loop in the
NH2-terminal domain (Figs. 3C and 4), the structure of PapD in the PapD-PapK co-complex
superimposes very well with apo PapD (r.m.s. deviation in Cα atom positions, excluding the
F1-G1 loop, of 0.65 Å). Hence, the binding of PapK does not alter the orientation of the
domains of PapD. The major difference between the apo and PapK-bound forms of PapD is a
large conformational change in the F1-G1 loop of PapD. The tip of this loop undergoes a flap
motion of about 11 Å that results in a re-ordering of the F1-G1 loop such that residues 101 to
105 of PapD become part of the G1 beta-strand.

The PapD-PapK interface. The total buried surface area in the PapD-PapK co-complex
is 3434 Å². There are two distinct sites on PapK that interact with two
corresponding sites on PapD. Site K1 of PapK interacts with a site on the NH2-terminal
domain (domain 1) of PapD (site D1) and site K2 of PapK interacts with a site on the
COOH-terminal domain (domain 2) of PapD (site D2) (Fig. 5).

Site K1 contains a deep groove which runs the length of the subunit. The edges of the
groove consist of strands A and F and its base is formed by the hydrophobic core of PapK
(Figs. 6A, 6B and 6E). This groove is the result of the missing G beta-strand in the Ig fold of
PapK. Site D1 includes residues 101 to 112 of the G1 beta-strand of PapD, which insert into
the K1 groove and make a beta-zipper interaction with strand F of PapK on one side of the
groove. Residues 101 to 105 also make a beta-zipper interaction with strand A2 on the other
side of the groove (Figs. 6A and 6B). Insertion of the G1 beta-strand also results in the
formation of a continuous 5-stranded beta-sheet which includes strands C1, F1, and G1 of
PapD and F and C1 of PapK (Fig. 2A). The alternating hydrophobic residues in the G1
beta-strand of PapD (Leu'", He'*", and Leu'^) interact with the hydrophobic base of the groove
(Fig. 6E). Thus the donor strand complementation by the G1 beta-strand of PapD shields the
hydrophobic core of the pilin from exposure to the aqueous milieu of the periplasm.

The K1-D1 interaction also involves contacts at the end of the groove nearest the cleft
of the chaperone. These interactions consist of hydrophobic and polar contacts between the
A1 strand of PapK and the A1, A2, and C1 strands of PapD (Figs. 6A and 6B). The
COOH-terminal carboxylate of PapK anchors the subunit into the cleft of PapD by hydrogen bonding
to the invariant Arg* and Lys'" residues of PapD as well as to the Oγ hydroxyl of highly
conserved Thr'" (Figs. 6C and 6D).

Site K2 is formed primarily by residues in 3₁₀ helix C and the COOH-terminal Arg'"
side chain of PapK (Figs. 6C and 6D). This interface is less extensive than site K1 (455 Å²).
Residues in site K2 interact with residues in the C2 and D2 strands and with the F2-G2 loop of
domain 2 of PapD (site D2). The K2-D2 interface includes hydrogen bonds between Thr" of
PapK and the main-chain carbonyls of Glu'** and Glu'" of PapD, as well as polar and
hydrophobic contacts involving Lys*' and Ile*? of PapK and Arg^ and Ile'** of PapD.

Example 2: Preparation and comparison of FimA subunits
from different strains of E. coli

Genomic DNA was prepared from overnight broth cultures of 59 uropathogenic E.
coli strains using the Puregene DNA Isolation Kit (Minneapolis, MN). DNA was amplified
by PCR using Taq polymerase (Perkin Elmer) with the following primers:
5'-CATCGCTGGCACAGGAAGGAGC-3' (SEQ ID NO: 53) and
5'-GTTGGTATGACCCGCATCAATCGC-3' (SEQ ID NO: 54) that flank the fimA locus,
under the following conditions: cycle 1 (95°C for 1 min), cycles 2-30 (95°C for 30 sec, 50°C
for 30 sec, 72°C for 2 min) in the presence of 3.0 mM MgCl2. The FimA amplified
fragments were purified with a QIAquick Purification Kit (Qiagen, Germany), sequenced
directly without subcloning using the dRhodamine Terminator Cycle Sequencing Kit (Perkin
Elmer, Norwalk, CT) and analyzed on the ABI 373 Automated DNA Sequencer (PE Applied
Biosystems, Foster City, CA). The FimA sequences were aligned and compared using the
Lasergene software program (DNAStar).
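
The cycling conditions above can be written down as a small program table to check the total hold time. This is only an arithmetic sketch: the variable and function names are hypothetical, and ramp times between temperatures, which depend on the thermocycler, are ignored.

```python
# Each step is (temperature in deg C, hold time in seconds).
INITIAL_CYCLE = [(95, 60)]                       # cycle 1
REPEATED_CYCLE = [(95, 30), (50, 30), (72, 120)]  # cycles 2 through 30
N_REPEATS = 29


def program_seconds(initial, repeated, n_repeats):
    """Total hold time of the program, ignoring ramping between steps."""
    initial_time = sum(seconds for _, seconds in initial)
    cycle_time = sum(seconds for _, seconds in repeated)
    return initial_time + n_repeats * cycle_time


total_seconds = program_seconds(INITIAL_CYCLE, REPEATED_CYCLE, N_REPEATS)
print(total_seconds, total_seconds / 60)
```

The hold times add up to 5280 seconds, or 88 minutes, before any ramping overhead.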

Example 3: Structure of FimH in the FimH-FimC Co-crystal
FimH is folded into two domains of the all-beta class. The NH2-terminal
mannose-binding domain comprises residues 1H - 155H, and the COOH-terminal pilin domain, which
is used to anchor the adhesin to the pilus, comprises residues 160H - 279H. A short extended
linker (residues 157H - 159H) connects the two domains. FimC in the co-complex has the
same overall structure as free FimC. The pilin domain of FimH binds in the cleft of the
chaperone, but mostly to the chaperone's NH2-terminal domain.

The lectin domain of FimH is an 11-stranded elongated beta-barrel with a jelly roll-like
topology (Figure 8B). A pocket capable of accommodating a mono-mannose unit is
located at the tip of the domain, distal from the connection to the pilin domain (Figure 9B).
The bottom of the pocket is lined with asparagine, glutamine and aspartic acid residues in
three loop regions, which are typical carbohydrate binding side chains (Figure 10A). A
molecule of cyclohexylbutanoyl-N-hydroxyethyl-D-glucamide (C-HEGA) is bound in this
pocket. C-HEGA is not a known inhibitor of FimH mannose binding but was needed in the
crystallization to produce useful co-crystals of FimC-FimH co-complex. The glucamide
moiety of C-HEGA is blocked at C1 and cannot form a pyranose, but is bent to approach the
pyranose conformation. The C2, C3, C4 and C6 hydroxyl groups of C-HEGA are enclosed
within the pocket, whereas the C5 hydroxyl and cyclohexylbutanoyl-N-hydroxyethyl groups
point out from the pocket and are solvent exposed. Residues Asp**", Gln"'", Asn'"", Asp'**"
and the NH2-terminal amino group of FimH (Figure 10A) are hydrogen bonded to the
glucamide moiety of C-HEGA. FimH from a urinary tract E. coli isolate which has a lysine
instead of asparagine at position 135H produces type 1 pili but is unable to mediate mannose
sensitive hemagglutination of guinea pig erythrocytes (S. Langermann, unpublished results).
Also, a mutation at residue 136H has been reported to completely block mannose binding.
See Schembri et al., FEMS Microbiol. Lett., 137, 257 (1996).

The pilin domain of FimH has the same immunoglobulin-like topology as the
NH2-terminal domain of FimC, except that the seventh strand of the fold is missing. Two
anti-parallel beta-sheets (strands A'BED' and D"CF) pack against each other to form a beta-barrel
that is similar to, but distinct from, immunoglobulin barrels. As in the chaperones, strand
switching occurs at the edges of the sheets. In the chaperones, the A1 strand of the
NH2-terminal domain switches between the two sheets of the barrel. The first strand of the pilin
domain exhibits a similar switch, but due to the lack of a seventh strand, the second half of
the A strand is not involved in main chain hydrogen bonding within the domain. The D
strand of the chaperones as well as of the FimH pilin domain also switches, but in the pilin
domain the switch is an 8-residue loop instead of the cis-proline bulge found in the
chaperones. The C-D loop and the D'-D" connection pack against each other and close the
top of the barrel. The other side of the barrel, defined by the A and F edge strands, is open.
Due to the absence of a seventh strand, a deep scar is created on the surface of the domain.
Residues that would be part of the hydrophobic core of an intact, seven-stranded PapD-like
domain instead line a deep hydrophobic crevice on the surface of the pilin domain.

Example 4: FimC-FimH Co-crystal Structure
FimC-FimH co-crystals were grown by hanging drop vapor diffusion by mixing 2 µl
of a protein solution (4 mg of FimC-FimH co-complex per milliliter pre-equilibrated in 300
mM of HEGA) with 2 µl of reservoir solution containing 1 M ammonium sulfate in 0.1 M
tris-HCl buffer (pH 8.2). The structure of the FimC-FimH co-complex was solved to 2.5 Å
(Table 5). Eight copies of the FimC-FimH co-complex in the asymmetric unit were arranged
as two sets of four molecules related by approximate 4, screw axes. Electron density was
excellent for one set of molecules (Figure 9A), allowing applicants to trace the entire
co-complex. For the second set of molecules, electron density was poorer but allowed for
unambiguous placement of a copy of the initially traced co-complex.

Two seleno-methionine FimC-FimH co-crystals were used to collect MAD (W. A.
Hendrickson, Science 254: 51 (1991)) data on BM14 of the ESRF. Data were recorded at
each of 3 wavelengths corresponding to the peak of the Se white line, the point of inflexion of
the K absorption edge, and a remote wavelength using a MAR CCD detector. Data were
reduced using the program HKL2000 (Z. Otwinowski and W. Minor, in Methods in
Enzymology, C. W. Carter, R. M. Sweet, Eds. (Academic Press, New York, 1997), vol. 276,
pp. 307), with further processing and scaling using the CCP4 processing package (CCP4,
Acta. Cryst. D50, 760 (1994)).

The co-crystals used for the structure determination belong to the space group C2 with
cell dimensions a = 139.08 ± 0.2 Å, b = 139.08 ± 0.2 Å, c = 214.49 ± 0.2 Å, and beta = 89.97
± 0.2°. The co-crystals exhibit strong pseudo P4.2,2 symmetry. An initial solution to the
Patterson function was produced in the tetragonal pseudo space group both automatically
using the program SOLVE (T. C. Terwilliger and J. Berendzen, Acta. Cryst. D53, 571
(1997)) and manually using the program RSPS (S. Knight, I. Andersson, C.-I. Brändén, J.
Mol. Biol. 215: 113 (1990)), and initial phases were calculated using SHARP (E. de la Fortelle and
G. Bricogne, in Methods in Enzymology, C. W. Carter, R. M. Sweet, Eds. (Academic Press,
New York, 1997), vol. 276, pp. 472). Density modification including 4-fold
non-crystallographic symmetry (NCS) averaging was done using the program DM (K. D. Cowtan, Joint
CCP4 ESF-EACBM Newsl. Protein Crystallogr. 31: 34 (1994)). A model corresponding to
the two copies of the co-complex in the pseudo asymmetric unit was built using O (T. A.
Jones et al., Acta. Cryst. A47, 110 (1991)), modeled in 4-fold averaged electron density and
refined against 2.5 Å native data applying tight non-crystallographic restraints. The crystals
are in either space group P4,2,2 or P4j, with cell dimensions a = b = 97.7 ± 0.2 angstroms and
c = 215.9 ± 0.2 angstroms. Bulk solvent correction, positional, simulated annealing, and
isotropic temperature factor refinement have been carried out using X-PLOR (A. T. Brünger,
X-PLOR Manual (Version 3.1): A system for X-ray crystallography and NMR (Yale
University Press, New Haven, CT, 1993)) and REFMAC (G. N. Murshudov, A. A. Vagin, E.
J. Dodson, Acta. Cryst. D53, 240 (1997)) with tight NCS restraints against a 2.5 Å native data
set collected at MAX II/BL711 in Lund. The current R-factor and R-free (on 5% of the data)
are 24.0% and 26.8%, respectively. The r.m.s. deviations from ideal bond length and angle
values are 0.016 Å and 3.3°, respectively. No residues are found in disallowed regions of the
Ramachandran plot. The coordinates have been deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank (code 1QUN).

Example 4: FimC-FimH Co-complex Structure
In the FimC-FimH co-complex, the seventh strand (G1 beta-strand) from the
NH2-terminal domain of the FimC chaperone is used to complement the pilin domain by being
inserted between the second half of the A strand and the F strand of the domain (Figure 10C).
Thus, the final strand (F) of FimH forms a parallel beta-strand interaction with the G1 strand
of FimC and has its COOH-terminal carboxyl group anchored in the crevice of the chaperone
cleft through hydrogen bonding with the conserved residues Ai^ and Lys"*= in FimC
(Figure 9A).

The G1 beta-strand of the FimC chaperone contains a conserved motif of solvent
exposed hydrophobic residues at positions 103, 105, and 107. In the FimC-FimH
co-complex, these residues are used to complete the unfinished hydrophobic core of FimH
(Figure IOC). The two residues Leu'«^ and Leu"»«= are deeply buried in the crevice created in 
the FimH pilin domain due to the missing seventh strand. ne"*'<= is somewhat closer to the 
domain surface but makes van der Waals contacts with residues Val'*"* and Phe"*". Lea}^ 
contacts residues De'""". Val^«. Leu^ and UcT^. Leu««<^ is in contact with He'"". Leu'»« 
Leu"*", ne^ and Val"*«. This mode of binding is called "donor strand complementation"
to emphasize the fact that the pilin domain is incomplete and that the chaperone donates its
G1 beta-strand to complete the fold of the pilin.

Example 5: Subunit-subunit interactions in Type 1 Pili
Genetic, biochemical and electron microscopic studies have demonstrated that
residues in the two conserved motifs (the COOH-terminal F strand and an NH2-terminal
motif) participate in subunit-subunit interactions necessary for pilus assembly. See G. E. Soto
et al., EMBO J., 17: 6155 (1998). An alignment of the pilin sequences, based on the
FimC-FimH co-crystal structure, revealed that the NH2-terminal motif was part of a 10-20 residue
NH2-terminal extension that was missing in the FimH pilin domain (Figure 8A). This region
contains a highly conserved pattern of alternating hydrophobic residues (highlighted in Figure
8A) similar to the donor G1 beta-strand of the chaperone. This motif is structurally analogous
to the G1 donor strand motif of the chaperone and molecular modeling indicates that it would
be able to fit into the same groove occupied by the donor G1 beta-strand of the chaperone.

The type 1 pilus is a right handed helix with about 3 subunits per turn, a diameter of
approximately 70 Å, a central pore of about 20-25 Å, and a rise per subunit of about 8 Å. In
order to obtain this structure, insertion of the NH2-terminal extension must be antiparallel to
strand F, in contrast to the parallel insertion observed for the G1 beta-strand of the chaperone.
Insertion in a parallel orientation would lead to rosette-like structures. One edge of the pilin
groove is lined by the COOH-terminal F strand, which has been shown to form a critical part
of the subunit tail. Thus, the NH2-terminal extension represents the head of a subunit and
during pilus biogenesis, it would displace the donor G1 beta-strand of the chaperone to fit into
the tail groove of a neighboring subunit and to complete the pilin fold of its neighbor in a
donor strand complementation mechanism.

Using the FimH pilin domain as a model for FimA, applicants constructed a model for
the type 1 pilus that fit these data (Figure 11). Each subunit was aligned to have its cleft
facing towards the center of the pilus so that the height from the top to the bottom of the
domain along the helix axis was approximately 25 Å. Applying a rotation of 115 degrees and
a rise per subunit of 8 Å, a hollow helical cylinder is created. The outer diameter of this
cylinder as measured across Cα atoms is 70 Å, and the inner diameter is 25 Å. FimA subunits
from different strains of E. coli exhibit considerable allelic variation. The vast majority of the
variable positions are on the outside surface of the pilus model proposed above (Figure 11),
which would account for the antigenic variability of type 1 pili.
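
The helical parameters quoted above can be checked with a few lines of arithmetic. A rotation of 115 degrees per subunit gives 360/115, or about 3.13, subunits per turn, and with a rise of 8 Å the pitch is about 25 Å, consistent with the roughly 25 Å subunit height along the helix axis. The sketch below is illustrative only, not the applicants' modeling procedure; the function names are hypothetical and the 24 Å placement radius is an arbitrary assumption lying between the stated inner radius (12.5 Å) and outer radius (35 Å).

```python
import math


def helix_parameters(rotation_deg, rise):
    """Return (subunits per turn, pitch) for a helix with the given step."""
    subunits_per_turn = 360.0 / rotation_deg
    pitch = subunits_per_turn * rise
    return subunits_per_turn, pitch


def helix_positions(n_subunits, rotation_deg, rise, radius):
    """Reference points for successive subunits on a right-handed helix."""
    positions = []
    for i in range(n_subunits):
        angle = math.radians(i * rotation_deg)  # counterclockwise as z increases
        positions.append((radius * math.cos(angle),
                          radius * math.sin(angle),
                          i * rise))
    return positions


subunits_per_turn, pitch = helix_parameters(115.0, 8.0)
# The radius is a placeholder assumption, not a value from the model.
positions = helix_positions(n_subunits=7, rotation_deg=115.0, rise=8.0, radius=24.0)

print(round(subunits_per_turn, 2), round(pitch, 2))
```

Because the pitch nearly equals the subunit height, successive turns of the model stack without large gaps, which is what produces a closed hollow cylinder rather than an open spiral.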

The proposed head-to-tail interaction between subunits in a pilus is reminiscent of
oligomerization through three-dimensional domain swapping in the sense that a part of the
molecule is used to complement another. However, in this case, complementation occurs not
only between identical protein chains (FimA in the pilus rod) but also between homologous
but distinct chains, e.g., FimG, FimF and FimH in the pilus tip. Furthermore, because
individual pilin protomers do not exist as stable monomers, there is no exchange of
structural units between a monomeric and an oligomeric state. Instead, a different protein,
the periplasmic chaperone, is needed to keep the monomeric subunits in solution by donating
a unique part of its structure (the G1 beta-strand) to the different subunit grooves.

Based on the structure of the FimC-FimH co-complex, pilins are missing the
steric information needed to fold into a native three dimensional structure. The
information that is missing consists of the seventh edge strand of an immunoglobulin fold.
This strand, which is necessary for folding, is donated to the hydrophobic core of the pilin by
the periplasmic chaperone in a donor strand complementation mechanism. Thus, the steric
information necessary for newly synthesized protein chains to fold correctly is not inherent in
the sequence of the protein to be folded; such information is instead transferred
from another protein, the periplasmic chaperone.

Example 6: FimH Binding to FimC and FimG by ELISA
The ability of FimH to bind to peptides corresponding to the G1 beta-strand of FimC
and the N-terminal extension of FimG was tested using an ELISA assay. During pilus
assembly, the G1 beta-strand of FimC completes the Ig fold of the FimH pilin domain in the
periplasm, and then in the pilus the N-terminal extension of FimG completes the Ig fold of the
FimH pilin domain.

In order to assess the ability of FimH to bind to the two peptides, FimH was purified
from the FimC-FimH co-complex. Synthetic peptides were synthesized corresponding to the
G1 beta-strand of FimC and the N-terminal extension of FimG. The synthesized peptide
sequences are as follows: FimC peptide, NTLQLAnSR (SEQ ID NO: 55) and FimG peptide,
DVTTTVNGK (SEQ ID NO: 56). Stock solutions of the peptides (5 mg/ml) were dissolved
in DMSO.

The peptides were diluted in phosphate buffered saline (PBS) (120 mM NaCl, 2.7 mM
KCl, 10 mM PBS, pH 7.4) to 2 nmol/50 µl. FimC protein was diluted to 0.1
nmol/50 µl and coated overnight onto microtiter wells with 50 µl/well at 4°C. The ELISA
assay was carried out as described in Kuehn et al., 1993 and Hung et al., 1996. Briefly, the
wells were washed three times with PBS and blocked with 3% bovine serum albumin
(BSA) in PBS for two hours at 25°C. Then the wells were washed three times with PBS. The
FimC-FimH co-complex was incubated in 3 M urea to separate the two proteins. Pure FimH
in 3 M urea was collected from the flow-through of a Source 15S column (Pharmacia). See
Barnhart et al., PNAS USA 97: 7709-7714 (2000). The wells were incubated with 50 µl of
FimH in 3% BSA-PBS diluted to 5-25 pmol/well FimH for 45 minutes at 25°C. The wells
were washed 3 times with PBS followed by incubation with a 1:1000 dilution of mouse anti-
FimH antibodies in 3% BSA-PBS for 45 minutes at 25°C. The wells were washed 3 times
with PBS followed by incubation with a 1:1000 dilution of goat antiserum to mouse IgG
(Sigma) conjugated to alkaline phosphatase diluted in 3% BSA-PBS for 45 minutes at 25°C.
The wells were washed 3 times with PBS and washed 3 times with developing buffer (10 mM
diethanolamine, 0.4 mM MgCl2). The ELISA was developed by adding 50 µl of substrate
(50 µl of filtered 1 mg/ml p-nitrophenyl phosphate; Sigma) in developing buffer. The reaction
was incubated for 1 hour at 25°C in the dark and the absorbance at 405 nm was read.

The competition assays were carried out similarly. FimC was coated onto microtiter
wells at 0.1 nmol/well. FimH at 5 pmol/well in 3% BSA-PBS was added to the FimC coated
wells in the presence or the absence of the FimC or FimG peptide at 2 nmol/well or the
indicated peptide concentration. Further, increasing concentrations of FimH were incubated
with constant concentrations of the FimC or FimG peptides or the FimC protein immobilized
on microtiter wells. FimH bound well both to pure FimC protein immobilized on microtiter
wells (Fig. 12) and to the peptides corresponding to the G1 beta-strand of FimC and the
N-terminal extension of FimG (Figure 12). Next, the ability of the peptides to inhibit FimH
binding to FimC was tested. FimH was added to the FimC coated wells in the presence or
absence of peptides to FimC or FimG. Increasing concentrations of the FimC peptide further
decreased the ability of FimH to bind to FimC immobilized on microtiter wells (Fig. 13).
The FimC peptide inhibited the ability of FimH to bind to FimC immobilized on the
microtiter wells (Fig. 14); however, the FimG peptide at the tested concentration did not
inhibit the ability of FimH to bind to FimC (Fig. 14).
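
To put the per-well amounts in the competition assay into perspective, the short Python sketch below converts them into molar ratios and approximate concentrations. It is an illustration only: the per-well amounts are those stated above, while the assumption that every component is present in the same 50 µl well volume, and all names in the code, are made here for the purpose of the calculation.

```python
# Per-well amounts stated in the text.
PEPTIDE_NMOL_PER_WELL = 2.0   # FimC or FimG peptide
FIMC_NMOL_PER_WELL = 0.1      # coated FimC protein
FIMH_PMOL_PER_WELL = 5.0      # added FimH

# ASSUMPTION: all components are considered in a 50 microliter well volume.
WELL_VOLUME_UL = 50.0


def molar_excess(competitor_nmol, ligand_pmol):
    """Fold molar excess of a competitor (nmol) over a ligand (pmol)."""
    return competitor_nmol * 1000.0 / ligand_pmol


def micromolar(amount_nmol, volume_ul):
    """Concentration in micromolar for an amount in nmol and a volume in µl."""
    # nmol per µl equals mmol per liter (mM); multiply by 1000 for µM.
    return amount_nmol / volume_ul * 1000.0


peptide_excess = molar_excess(PEPTIDE_NMOL_PER_WELL, FIMH_PMOL_PER_WELL)
fimc_excess = molar_excess(FIMC_NMOL_PER_WELL, FIMH_PMOL_PER_WELL)
peptide_um = micromolar(PEPTIDE_NMOL_PER_WELL, WELL_VOLUME_UL)
fimh_um = micromolar(FIMH_PMOL_PER_WELL / 1000.0, WELL_VOLUME_UL)

print(f"peptide : FimH molar ratio = {peptide_excess:.0f} : 1")
print(f"FimC : FimH molar ratio    = {fimc_excess:.0f} : 1")
print(f"peptide concentration      = {peptide_um:.0f} µM")
print(f"FimH concentration         = {fimh_um:.1f} µM")
```

On these assumptions the competing peptide is present in several-hundred-fold molar excess over FimH, which is the regime in which the FimC peptide, but not the FimG peptide, was observed to inhibit binding.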

Other features, objects and advantages of the present invention will be apparent to
those skilled in the art. The explanations and illustrations presented herein are intended to
acquaint others skilled in the art with the invention, its principles, and its practical
application. Those skilled in the art may adapt and apply the invention in its numerous
forms, as may be best suited to the requirements of a particular use. Accordingly, the specific
embodiments of the present invention as set forth are not intended as being exhaustive or
limiting of the present invention.

We claim:

1. An isolated compound which binds to a pilus subunit groove thereby inhibiting pilus
assembly.

2. The compound of claim 1 wherein the compound is a peptide.

3. The compound of claim 1 wherein the compound is a non-peptide compound.

4. The compound of claim 1 further comprising a mimic of an amino-terminal motif of a
pilus subunit with at least two alternating hydrophobic amino acid residues which
mimic exhibits antibacterial activity against a Gram-negative bacterium.

5. The compound of claim 1 further comprising a mimic of a chaperone G1 beta-strand
with at least two alternating hydrophobic amino acid residues which exhibits
antibacterial activity against a Gram-negative bacterium.

6. The compound of any one of claims 1-5 wherein the compound has been modified to
improve binding, specificity, solubility, safety or efficacy.

7. The compound of claim 1 which is a 10 to 20 residue peptide or peptide analog
according to formula (I):

(I) Z1~Z2-X1-X2-X3-X4-X5-X6-X7-X8-X9-X10-Z3~Z4

or a pharmaceutically-acceptable salt thereof, wherein:
Z1 is R-C(O)-NR- or RRN-;
Z2 is an optional 1 to 5 residue peptide or peptide analog;
X1 is any amino acid residue;
X2 is any amino acid residue;
X3 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue;
X4 is any amino acid residue;
X5 is a hydrophobic residue or Gly;
X6 is a hydrophobic or a hydrophilic residue;
X7 is Gly, an amide-substituted polar residue or a hydrophobic residue;
X8 is any amino acid residue;
X9 is an aliphatic residue;
X10 is any amino acid residue;
Z3 is an optional 1 to 5 residue peptide or peptide analog;
Z4 is -C(O)OR or -C(O)NRR;

each R is mdependently hydrogen, (CpC^) alkyl, (Q-CJ alkenyl, (Q-C^) 
aUcynyl or (Q-C,4) aryl; 

each "-" between residues X1 through X10, Z2 and X1, and X10 and Z3
independently represents an amide linkage, a substituted amide linkage or an isostere
of an amide linkage; and

each "~" represents a bond.

8. The compound of claim 7 wherein said compound further comprises one or more
features selected from the group consisting of:

each "-" between residues X1 through X10, Z2 and X1, and X10 and Z3 is an
amide linkage;
Z1 is H2N-;
Z4 is -C(O)OH or a salt thereof;
optional Z2 is not present;
optional Z3 is not present;
X1 is other than a basic residue;
X2 is other than an aliphatic residue;
X3 is an aliphatic residue or T;
X4 is other than an acidic residue;
X5 is an aliphatic residue, F or G;
X7 is G, N or A;
X8 is other than an aliphatic residue; and
X10 is an aliphatic or a polar residue.

9. The compound of claim 8 which is selected from the group consisting of SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7,
SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12,
SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17,
SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22,
SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27,
SEQ ID NO: 28 and SEQ ID NO: 29.

10. The compound of claim 1 which is a 7 to 17 residue peptide or peptide analog
according to formula (II):

(II) Z11~Z12-X11-X12-X13-X14-X15-X16-X17-Z13~Z14

or a pharmaceutically-acceptable salt thereof, wherein:
Z11 is R'-C(O)-NR'- or R'R'N-;
Z12 is an optional 1 to 5 residue peptide or peptide analog;
X11 is any amino acid residue;
X12 is any amino acid residue;
X13 is a hydrophobic residue;
X14 is any amino acid residue;
X15 is a hydrophobic residue;
X16 is any amino acid residue;
X17 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue;
Z13 is an optional 1 to 5 residue peptide or peptide analog;
Z14 is -C(O)OR' or -C(O)NR'R';

each R- is independently hydrogen, (C,-C«) alkyl. (C,-C«) aDcenyl. (C,-C«) 
aUcynyIor(Q-C,JaiyI; ' . 

each "-" between residues X11 through X17, Z12 and X11, and X17 and Z13
independently represents an amide linkage, a substituted amide linkage or an isostere
of an amide linkage; and

each "~" independently represents a bond.

11. The compound of claim 10 wherein said compound further comprises one or more
features selected from the group consisting of:

each "-" between residues X11 through X17, Z12 and X11, and X17 and Z13 is an
amide linkage;
Z11 is H2N-;
Z14 is -C(O)OH or a salt thereof;
optional Z12 is not present;
optional Z13 is not present;
X11 is other than a basic residue;
X13 is an aliphatic residue or M;
X14 is other than an aromatic residue;
X15 is an aliphatic residue, F or M; and
X17 is an aliphatic residue, F, M or a hydroxyl-substituted aliphatic residue.

12. The compound of claim 11 which is selected from the group consisting of SEQ ID
NO: 1, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID
NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID
NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID
NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID
NO: 49, SEQ ID NO: 50, SEQ ID NO: 51 and SEQ ID NO: 52.

13. The compound of any one of claims 1-12 wherein said compound exhibits
antibacterial activity against a Gram-negative bacterium comprising Escherichia coli,
Haemophilus influenzae, Salmonella enteritidis, Salmonella typhimurium, Bordetella
pertussis, Yersinia pestis, Yersinia enterocolitica, Helicobacter pylori and Klebsiella
pneumoniae.



14. A mannose analogue capable of competitively binding the amino terminal mannose-
binding domain of a Gram-negative bacterial adhesin.

15. The analogue of claim 14 wherein said compound exhibits antibacterial activity
against a Gram-negative bacterium comprising Escherichia coli, Haemophilus
influenzae, Salmonella enteritidis, Salmonella typhimurium, Bordetella pertussis,
Yersinia pestis, Yersinia enterocolitica, Helicobacter pylori and Klebsiella
pneumoniae.

16. A composition comprising a compound according to any one of claims 1-15 and a
pharmaceutically acceptable carrier, excipient or diluent.

17. A method of preventing or inhibiting formation of a pilus subunit-subunit structure in
a subject, said method comprising administering an effective amount of a compound
according to any one of claims 1-13.

18. A method of preventing or inhibiting formation of a chaperone-subunit structure in a
subject, said method comprising administering an effective amount of a compound
according to any one of claims 1-13.

19. A method of treating a bacterial infection comprising administering to a subject in
need thereof an effective amount of a compound according to any one of claims 1-15.

20. The method of claim 19 wherein the bacterial infection is caused by a bacterium
comprising Escherichia coli, Haemophilus influenzae, Salmonella enteritidis, Salmonella
typhimurium, Bordetella pertussis, Yersinia pestis, Yersinia enterocolitica,
Helicobacter pylori and Klebsiella pneumoniae.

21. The method of any one of claims 17-20 wherein the subject is a mammal or human.

22. The method of any one of claims 17-20 wherein the subject is a plant.

23. A method of preventing or inhibiting pili adhesion to a host tissue, said method
comprising administering a mannose analogue of claim 14 or 15.

24. A method of preventing or inhibiting biofilm formation, said method comprising
administering an effective amount of a compound of any one of claims 1-15 to an
environment or surface containing Gram-negative bacteria.

25. A method for inhibiting bacterial colonization by a Gram-negative organism, said
method comprising administering an effective amount of a compound of any one of
claims 1-15 to an environment or surface containing Gram-negative bacteria.

26. A composition comprising a pilus chaperone-subunit co-complex in crystalline form,
wherein said co-complex comprises an amino acid sequence of a G1 beta-strand of a
chaperone and an amino acid sequence of an amino-terminal end of a pilus subunit.

27. The composition of claim 26 wherein said amino acid sequence of the G1 beta-strand
of the chaperone is derived from a N101 to L107 amino acid region of the G1 beta-
strand of a chaperone.

28. The composition of claim 27 wherein the amino acid sequence derived from a G1
beta-strand of a chaperone is SEQ ID NO: 1.

29. The composition of any one of claims 26-28 wherein the amino acid sequence derived
from an amino acid sequence of an amino-terminal end of a pilus subunit is SEQ ID
NO: 12.

30. The composition of claim 26 wherein the pilus chaperone-subunit co-complex in
crystalline form is a PapD-PapK chaperone-subunit co-complex.

31. The composition of claim 30 wherein the crystal has a space group of P2₁2₁2₁ with
unit cell dimensions of a = 62.1 ± 0.2 angstroms, b = 63.6 ± 0.2 angstroms and c =
92.7 ± 0.2 angstroms.

32. The composition of claim 31, wherein said crystal is of diffraction quality.

33. The composition of claim 31, wherein said crystal is a native crystal.

34. The composition of claim 31, wherein said crystal is a heavy-atom derivative crystal.

35. The composition of claim 31, wherein at least one of PapD or PapK of the PapD-
PapK chaperone-subunit co-complex is a mutant.

36. The crystal of claim 35, wherein the mutant is a selenomethionine or selenocysteine
mutant.

37. The crystal of claim 35, wherein the mutant is a conservative mutant.

38. The crystal of claim 35, wherein the mutant is a truncated or extended mutant.

39. The composition of claim 31, wherein said crystal is produced by a method
comprising the steps of:

(a) mixing a volume of a solution comprising the PapD-PapK chaperone-
subunit co-complex with a volume of a reservoir solution comprising a
precipitant; and

(b) incubating the mixture obtained in step (a) over the reservoir solution
in a closed container, under conditions suitable for crystallization until
the crystal forms.

40. A method of producing a PapD-PapK chaperone-subunit co-complex in crystalline
form, said method comprising:

(a) mixing a volume of a solution comprising the PapD-PapK chaperone-
subunit co-complex with a volume of a reservoir solution comprising a
precipitant; and

(b) incubating the mixture obtained in step (a) over the reservoir solution
in a closed container, under conditions suitable for crystallization until
the crystal forms.

41. A method of identifying an antibacterial compound, comprising the step of using a
three-dimensional structural representation of a pilus chaperone-subunit co-complex,
or a fragment thereof comprising a G1 beta-strand binding cleft, to computationally
screen a candidate compound for an ability to bind the G1 beta-strand binding cleft of
the pilus subunit.

42. The method of claim 41 further comprising the steps of:
synthesizing the candidate compound; and
screening the candidate compound for antibacterial activity.

43. The method of claim 42 wherein the three dimensional structural information
comprises the atomic structure coordinates of a PapK subunit.

44. The method of claim 43 wherein the three dimensional structural information further
comprises the atomic structure coordinates of residues comprising the G1 beta-strand
binding cleft of a PapK subunit.

45. The method of claim 43 or 44 wherein the atomic structure coordinates are obtained
from the atomic structure coordinates of a PapD-PapK chaperone-subunit co-complex.

46. The method of claim 45 wherein the PapD-PapK co-complex atomic structure
coordinates are those coordinates deposited at the Protein Data Bank under entry code
1PDK.

47. The method of claim 42 wherein the structural information comprises the atomic
structure coordinates of a FimH subunit.

48. The method of claim 47 wherein the structural information further comprises the
atomic structure coordinates of residues comprising a G1 beta-strand binding cleft of a
FimH subunit.

49. The method of claim 47 or 48 wherein the atomic structure coordinates are obtained
from the atomic structure coordinates of a FimC-FimH chaperone-adhesin co-
complex.

50. The method of claim 49 wherein the atomic structure coordinates are those
coordinates deposited at the Research Collaboratory for Structural Bioinformatics
Protein Data Bank under entry code 1QUN.

51. A method of identifying an antibacterial compound comprising the step of using a
three-dimensional structural representation of a pilus chaperone-subunit co-complex,
or a fragment thereof comprising a G1 beta-strand binding cleft, to computationally
design a synthesizable candidate compound that binds the G1 beta-strand binding cleft
of a pilus subunit.

52. The method of claim 51 wherein the computational design comprises the steps of:
identifying chemical entities or fragments capable of associating with the G1
beta-strand binding cleft of the chaperone subunit; and
assembling the chemical entities or fragments into a single molecule to
provide the structure of the candidate compound.

53. The method of claim 52 further comprising the steps of:
synthesizing the candidate compound; and
screening the candidate compound for antibacterial activity.

54. The method of claim 53 wherein the structural information comprises the atomic
structure coordinates of a PapK subunit.

55. The method of claim 54 wherein the structural information further comprises the
atomic structure coordinates of residues comprising the G1 beta-strand binding cleft of
a PapK subunit.

56. The method of claim 54 or 55 wherein the atomic structure coordinates are obtained
from the atomic structure coordinates of a PapD-PapK chaperone-subunit co-complex.

57. The method of claim 56 wherein the atomic structure coordinates of the PapD-PapK
co-complex are those coordinates deposited at the Protein Data Bank under entry code
1PDK.

58. The method of claim 53 wherein the structural information comprises the atomic
structure coordinates of a FimH subunit.

59. The method of claim 58 wherein the structural information comprises the atomic
structure coordinates of residues comprising a G1 beta-strand binding cleft of a FimH
subunit.

60. The method of claim 58 or 59 wherein the atomic structure coordinates are obtained
from the atomic structure coordinates of a FimC-FimH chaperone-adhesin co-
complex.

61. The method of claim 60 wherein the atomic structure coordinates of the FimC-FimH
chaperone-adhesin are those coordinates deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank under entry code 1QUN.



62. A method of identifying a compound having antibacterial activity, comprising the step
of using a three-dimensional structural representation of a chaperone, or a fragment
thereof comprising a G1 beta-strand, to identify or design a compound having a three-
dimensional structure similar to the three-dimensional structure of the G1 beta-strand
of the chaperone.

63. The method of claim 62 wherein the three-dimensional structural information
comprises the atomic structure coordinates of residues comprising a G1 beta-strand of
a PapD chaperone subunit or a FimC chaperone.

64. The method of claim 63 wherein the three-dimensional structural information
comprises the atomic structure coordinates of a PapD chaperone.

65. The method of claim 63 or 64 wherein the atomic structure coordinates of the PapD
chaperone are obtained from the atomic structure coordinates of a PapD-PapK
chaperone-subunit co-complex.

66. The method of claim 65 wherein the atomic structure coordinates of the PapD-PapK
chaperone-subunit co-complex are those deposited at the Protein Data Bank under
entry code 1PDK.

67. The method of claim 63 wherein the three-dimensional structural information
comprises the atomic structure coordinates of a FimC chaperone.

68. The method of claim 67 wherein the atomic structure coordinates of the FimC
chaperone are obtained from the atomic structure coordinates of a FimC-FimH
chaperone-adhesin co-complex.

69. The method of claim 68 wherein the structure coordinates of the FimC-FimH
chaperone-adhesin co-complex are those deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank under entry code 1QUN.

70. A method of identifying an antibacterial compound, said method comprising the step
of using a three-dimensional structural representation of an adhesin, or a fragment
thereof comprising a lectin binding domain or portion thereof, to screen a candidate
compound for the ability to bind a lectin binding domain of the adhesin.

71. The method of claim 70, further comprising the steps of:
synthesizing the candidate compound; and
assaying the candidate compound for antibacterial activity.

72. The method of claim 71 wherein the three-dimensional structural information
comprises the atomic structure coordinates of a FimH adhesin.

73. The method of claim 72 wherein the three-dimensional structural information further
comprises the atomic structure coordinates of residues comprising a lectin binding
domain of a FimH adhesin or portion thereof.

74. The method of claim 72 or 73 wherein the atomic structure coordinates are obtained
from the structure coordinates of a FimC-FimH chaperone-adhesin co-complex.

75. The method of claim 74 wherein the structure coordinates of the FimC-FimH
chaperone-adhesin co-complex are those deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank under entry code 1QUN.

76. A method of identifying an antibacterial compound comprising the step of using a
three-dimensional structural representation of an adhesin, or a fragment thereof
comprising a lectin binding domain or portion thereof, to computationally design a
compound that binds the lectin binding domain of the adhesin.

77. The method of claim 76 wherein the computational design comprises the steps of:
identifying chemical entities or fragments capable of associating with the
lectin binding domain; and
assembling the chemical entities or fragments into a single molecule to
provide the structure of the candidate compound.

78. The method of claim 77, further comprising the steps of:
synthesizing the candidate compound; and
screening the candidate compound for antibacterial activity.

79. The method of claim 78 wherein the three-dimensional structural information
comprises the atomic structure coordinates of a FimH adhesin.

80. The method of claim 79 wherein the three-dimensional structural information further
comprises the atomic structure coordinates of residues comprising a lectin binding
domain of a FimH adhesin.

81. The method of claim 79 or 80 wherein the atomic structure coordinates are obtained
from the structure coordinates of a FimC-FimH chaperone-adhesin co-complex or
portion thereof.

82. The method of claim 81 wherein the structure coordinates of the FimC-FimH
chaperone-adhesin co-complex are those deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank under entry code 1QUN.



83. A machine-readable medium embedded with information that corresponds to a three-
dimensional structural representation of a crystalline pilus chaperone-subunit co-
complex or a fragment or portion thereof.

84. The machine-readable medium of claim 83 wherein the pilus chaperone-subunit co-
complex is a PapD-PapK chaperone-subunit co-complex.

85. The machine-readable medium of claim 84 wherein at least one subunit of the PapD-
PapK co-complex is a mutant.

86. The machine-readable medium of claim 85 wherein the mutant is a selenomethionine
or selenocysteine mutant.

87. The machine-readable medium of claim 85 wherein the mutant is a conservative
mutant.

88. The machine-readable medium of claim 84, in which the information comprises
atomic structure coordinates, or a subset thereof.

89. The machine-readable medium of claim 88 wherein the atomic structure coordinates
are those deposited at the Protein Data Bank under entry code 1PDK, or a subset
thereof.

90. The machine-readable medium of claim 83 wherein the pilus chaperone-subunit co-
complex is a FimC-FimH chaperone-adhesin co-complex.

91. The machine-readable medium of claim 90 wherein at least one subunit of the FimC-
FimH chaperone-adhesin co-complex is a mutant.

92. The machine-readable medium of claim 91 wherein the mutant is a selenomethionine
or selenocysteine mutant.

93. The machine-readable medium of claim 91 wherein the mutant is a conservative
mutant.

94. The machine-readable medium of claim 90, in which the information comprises
atomic structure coordinates, or a subset thereof.

95. The machine-readable medium of claim 94 wherein the atomic structure coordinates
are those deposited at the Research Collaboratory for Structural Bioinformatics
Protein Data Bank under entry code 1QUN, or a subset thereof.
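
The residue pattern recited for the X1 through X10 core of formula (I) in claim 7 can be illustrated with a small sequence check. The Python sketch below is a hypothetical illustration only: the assignment of one-letter residues to the classes "hydrophobic", "aliphatic", "hydroxyl-substituted aliphatic" and "amide-substituted polar" is an assumption made here, since those class definitions are not given in this section, and the terminal groups Z1 through Z4 and non-standard residues or peptide analogs are ignored.

```python
# ASSUMED residue classes (one-letter codes); not taken from the claims.
ALIPHATIC = set("AVLI")
AROMATIC = set("FWY")
HYDROPHOBIC = ALIPHATIC | AROMATIC | set("MP")
HYDROXYL_ALIPHATIC = set("ST")
AMIDE_POLAR = set("NQ")
ANY = set("ACDEFGHIKLMNPQRSTVWY")

# Allowed residues at each core position X1..X10, following claim 7.
FORMULA_I = [
    ANY,                               # X1: any residue
    ANY,                               # X2: any residue
    HYDROPHOBIC | HYDROXYL_ALIPHATIC,  # X3
    ANY,                               # X4: any residue
    HYDROPHOBIC | {"G"},               # X5: hydrophobic or Gly
    ANY,                               # X6: hydrophobic or hydrophilic
    {"G"} | AMIDE_POLAR | HYDROPHOBIC, # X7
    ANY,                               # X8: any residue
    ALIPHATIC,                         # X9: aliphatic
    ANY,                               # X10: any residue
]


def matches_formula_i(core):
    """Return True if a 10-residue core fits the assumed formula (I) pattern."""
    core = core.upper()
    if len(core) != len(FORMULA_I):
        return False
    return all(res in allowed for res, allowed in zip(core, FORMULA_I))


# SEQ ID NO: 2 from the sequence listing, written in one-letter code.
print(matches_formula_i("GKVTFNGTVV"))
# Same peptide with an acidic residue at X9, which is required to be aliphatic.
print(matches_formula_i("GKVTFNGTDV"))
```

A check of this kind only screens the alternating hydrophobic pattern of the core; it says nothing about binding or antibacterial activity, which the claims tie to experimental testing.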
SEQUENCE LISTING
<110> WASHINGTON UNIVERSITY

<120> ANTI-BACTERIAL COMPOUNDS DIRECTED AGAINST PILUS
BIOGENESIS, ADHESION AND ACTIVITY; CO-CRYSTALS OF PILUS
SUBUNITS AND METHODS OF USE THEREOF

<130> WSHU2005.2

<140>
<141>

<150> US 60/148,280
<151> 1999-08-11

<160> 56

<170> PatentIn Ver. 2.1

<210> 1
<211> 7
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 1
Asn Val Leu Gln Ile Ala Leu
1 5

<210> 2
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 2
Gly Lys Val Thr Phe Asn Gly Thr Val Val
1 5 10

<210> 3
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 3
Gly Thr Val His Phe Lys Gly Glu Val Val
1 5 10



<210> 4
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 4
Gly Lys Val Thr Phe Phe Gly Lys Val Val
1 5 10

<210> 5
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 5
Gly Thr Ile Val Ile Thr Gly Thr Ile Thr
1 5 10

<210> 6
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 6
Gly Thr Ile Val Ile Thr Gly Ser Ile Ser
1 5 10
<210> 7
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 7
Gly Thr Val Lys Phe Val Gly Ser Ile Ile
1 5 10

<210> 8
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 8
Gly Glu Ile Gln Leu Lys Gly Glu Ile Val
1 5 10

<210> 9
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 9
Gly Thr Ile Lys Phe Thr Gly Glu Ile Val
1 5 10

<210> 10
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 10
Asn Glu Val Thr Phe Leu Gly Ser Val Ser
1 5 10



WO 01/10386 PCT/US00/22087



<210> 11
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 11
Gly Thr Ile Asn Phe Glu Gly Ser Val Val
1 5 10

<210> 12
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 12
Ser Asp Val Ala Phe Arg Gly Asn Leu Leu
1 5 10

<210> 13
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 13
Gly Arg Ala Ala Phe His Gly Glu Val Val
1 5 10

<210> 14
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 14
Gly Arg Ala Thr Phe His Gly Glu Val Val
1 5 10
<210> 15
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 15
Asp Asn Leu Thr Phe Arg Gly Lys Leu Ile
1 5 10

<210> 16
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 16
Asp Asn Leu Thr Phe Lys Gly Lys Leu Ile
1 5 10

<210> 17
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 17
Gly Trp Leu Asn Leu Gln Gly Thr Ile Leu
1 5 10

<210> 18
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 18
Ser Val Val Asn Ile Thr Gly Asn Val Gln
1 5 10
<210> 19
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 19
Thr Thr Ile Thr Val Thr Gly Asn Val Leu
1 5 10

<210> 20
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 20
Thr Thr Ile Thr Val Thr Gly Arg Val Leu
1 5 10

<210> 21
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 21
Cys Met Leu Ala Gly Ser Asn Phe Val Thr
1 5 10

<210> 22
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 22
Val Gln Ile Asn Ile Arg Gly Asn Val Tyr
1 5 10
<210> 23
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 23
Pro Asn Leu Lys Leu Phe Gly Thr Leu Leu
1 5 10

<210> 24
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 24
Val Tyr Ile Asn Ile Thr Gly Asn Val Ile
1 5 10

<210> 25
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 25
Gly Lys Ile Thr Phe Asn Gly Lys Val Val
1 5 10

<210> 26
<211> 10
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthesized
Sequence

<400> 26
Gly Thr Ile Asn Phe Asn Gly Lys Ile Thr
1 5 10
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<210>27 
<211>10 
<212>PRT 

^13> Artificial Sequence 
<220> 

<n3> Description of Artificial Sequence: Syn&esized 
Sequence 

<400>27 

Gin Lys Thr lie Phe Ser Ala Asp Val Val 
15 10 



<210>28 
<211>10 
<212>PRT 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>28 

Gly Gln Val Asn Phe Phe Gly Lys Val Thr
1 5 10 



<210>29 
<211>10 
<212>PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized
Sequence 

<400>29 

Glu Arg Thr Ile Ile Thr Ala Asp Val Val
1 5 10



<210>30 

<211>7 

<212>PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence

<400>30

Gly Ser Leu Ser Leu Ala Ile
1 5
<210>31

<211>7 
<212>PRT 

<213> Artificial Sequence

<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>31 

Asn Tyr Leu Gln Phe Ala Ile
1 5



<210>32 

<211>7 

<212>PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence

<400>32 

Ser Gly Ile Ala Val Ala Leu
1 5



<210>33 

<211>7 

<212>PRT 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthesized 
Sequence

<400>33 

Asn Ile Leu Gln Leu Ala Ile
1 5



<210>34

<211>7 

<212>PRT 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>34 

Ser Phe Met Glu Ile Ala Ile
1 5 
<210>35

<211>7 

<212>PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized
Sequence 

<400>35 

Asn Tyr Leu Gln Phe Ala Val
1 5



<210>36 

<211>7 

<212>PRT 

<213> Artificial Sequence

<220> 

<223> Description of Artificial Sequence: Synthesized
Sequence 

<400>36 

Asn Thr Leu Gln Leu Ala Ile
1 5

<210>37

<211>7 

<212>PRT 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>37 

Gly Val Leu Gln Leu Thr Ile
1 5 



<210>38 

<211>7 

<212>PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>38 

Asn Val Leu Ala Val Ala Val 
1 5
<210>39
<211>7 

<212>PRT
<213> Artificial Sequence 

<220>

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>39 

Ser Leu Leu Gln Leu Ala Phe
1 5



<210>40

<211>7 

<212>PRT 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>40 

Ser Gly Ile Ala Val Ala Val
1 5



<210>41 

<211>7 

<212>PRT 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthesized
Sequence 

<400>41 

Asn Ala Leu Lys Phe Ala Met 
1 5 



<210>42 

<211>7 

<212>PRT 

<213> Artificial Sequence

<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>42 

Asn Val Leu GM Met Ala Met 
1 5



<210>43

<211>7 

<212>PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized
Sequence 

<400>43 

Asn Tyr Leu Gln Phe Ala Ile
1 5



<210>44

<211>7 

<212>PRT 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence

<400>44 

Asn Val Leu Gln Ile Ala Val
1 5 



<210>45 

<211>7 

<212>PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>45

Leu Asn Val Asn Val Val Thr 
1 5



<210>46

<211>7 

<212>PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized
Sequence 

<400>46 

Val Phe Val Glu Phe Ala Ile
1 5
<210>47

<211>7 

<212>PRT 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: Synthesized
Sequence 

<400>47 

Met Lys Leu Asn Val Ser Ile
1 5



<210>48

<211>7 

<212>PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>48 

Met Asp Ile Gln Met Ser Ile
1 5



<210>49 

<211>7 

<212>PRT 

<213> Artificial Sequence

<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>49 

Leu Asn Ile Leu Leu Ser Val
1 5 



<210> 50 

<211>7 

<212>PRT 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>50

Met Asn Ile Glu Val Ser Val
1 5
<210>51

<211>7 

<212>PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>51 

Asp Ser Ile Asn Ile Ser Ile
1 5 



<210>52

<211>7 

<212>PRT 

<213> Artificial Sequence 

<220>

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400>52 

Leu Asn Val Glu Leu Ser Val
1 5



<210>53 
<211>22
<212>DNA 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer
<400> 53 

catcgctggc acaggaagga gc 22

<210>54 
<211>24 
<212>DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Primer 
<400>54 

gttggtatga cccgcatcaa tcgc 24 
<210>55
<211>10 
<212>PRT 

<213> Artificial Sequence

<220> 

<223> Description of Artificial Sequence: Synthesized 
Proteins

<400>55 

Asn Thr Leu Gln Leu Ala Ile Ile Ser Arg
1 5 10 

<210>56

<211>9 

<212>PRT 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthesized
Proteins 

<400>56

Asp Val Thr Ile Thr Val Asn Gly Lys